Slide View : Parallel Computer Architecture and Programming : 15-418/618 Fall 2016

Holladay

How is this prediction done? That seems like a very difficult problem..

bpr

@Holladay, branch prediction has been extensively studied and while beyond the scope of the course, just using 2 bits per branch can achieve better than 90% accuracy, see the 2-bit saturating counter.

Holladay

@bpr Thanks! Thats impressive!

tcm

As an aside, one of the more surprising discoveries in computer architecture is the the extent to which branch prediction can be improved by correlating branch outcomes with other branch outcomes. This is a way to achieve even better accuracy than a 2-bit predictor. In case you are curious, here are some papers to read on this topic:

http://dl.acm.org/citation.cfm?id=225168 http://dl.acm.org/citation.cfm?id=195549

The downside of correlation-based branch prediction is that it requires larger prediction tables. Before 2004, processors had been dedicating more and more space to these types of tables. Now that we are trying to be more area-efficient, however, people have scaled things back a bit.

llcoolj

could we run both the outcomes in parallel and then squash the wrong one later? taking more resources but maybe being faster in certain situations?

bpr

@llcoolj, Yes, it is an interesting idea to execute both branches. This idea was explored in the 90s under the terms "dual path" and "multi-path" execution. One such paper is Threaded multiple path execution.

ferozenaina

Also, it is interesting to know even if we pick a branch randomly (50% accuracy) and execute the 'then' part, we will still have a decent speedup.

I did not know we could achieve 90%+ speedup!