Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2016

0xc0ffee

So this is a tradeoff - trading ILP for multithreaded parallelism?

hofstee

It's more like trading single core performance for multiple cores. We lost a lot more than just ILP.

GGOda

Comparing this slide to the last slide, does this mean that multi-core processors do not use branch prediction at all?

hofstee

Kayvon's fictitious processor has no branch predictor. Modern CPUs definitely do have branch predictors.

sidwad

@0xc0ffee: I think the limiting factor that forces us to remove the fancy bits, such as branch prediction, in order to add more cores is the cost of the CPU.

Let me explain: if we can save cost by removing the parts of the CPU whose absence causes only a $25\%$ drop in sequential single-core performance, and use that money to add another core instead, we get two cores that each run at $0.75\times$ the original speed. On a workload that parallelizes well, that is approximately a $1.5\times$ speedup for a small increase in the cost of the CPU.
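A rough sketch of that arithmetic, assuming the work splits perfectly across cores (an idealization). The function name and parameters are made up for illustration, and the numbers are the hypothetical ones from this comment, not measurements:

```python
def ideal_throughput(cores: int, per_core_perf: float) -> float:
    """Aggregate throughput relative to one full-featured core.

    Assumes perfect parallel scaling: no serial portion and no
    synchronization or memory-bandwidth overhead.
    """
    return cores * per_core_perf


# Two simplified cores, each 25% slower than the original core.
print(ideal_throughput(cores=2, per_core_perf=0.75))  # 1.5
```

Any serial portion in the program would pull the real speedup below this upper bound.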

fleventyfive

But as far as I know, all modern CPUs definitely have branch predictor logic! Even though it is not visible in the image here, those modules must be present. However, I do agree that they may not have the same level of sophistication as in single-core processors (this would need to be confirmed, though).

hofstee

@fleventyfive yes, but the slide here is not describing an actual processor; it describes Kayvon's fictitious chip, so there is no branch predictor. The branch predictors in modern chips (even multi-core ones) are highly sophisticated.

jellybean

I would be curious to know how the ratio of speedup to added transistors compares between making the per-core logic more sophisticated and adding another core. Are there cases where the sophisticated logic gives a greater relative speedup than another core would? I think this might occur for computation that is inherently sequential, with data access patterns that are easily predictable.
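One way to explore this question is an Amdahl's-law style estimate. The sketch below is only a toy model: it reuses the hypothetical $0.75\times$ per-core figure from earlier in the thread, ignores transistor counts entirely, and the function name and parameters are assumptions made for illustration.

```python
def relative_speedup(parallel_fraction: float, cores: int, per_core_perf: float) -> float:
    """Speedup relative to one full-featured core (performance 1.0).

    The serial part of the program runs on a single simplified core;
    the parallel part is divided evenly across all simplified cores.
    """
    serial_time = (1.0 - parallel_fraction) / per_core_perf
    parallel_time = parallel_fraction / (per_core_perf * cores)
    return 1.0 / (serial_time + parallel_time)


# Compare two simplified cores against one sophisticated core
# for programs with different parallel fractions.
for f in (0.0, 0.25, 0.5, 0.75, 1.0):
    s = relative_speedup(f, cores=2, per_core_perf=0.75)
    winner = "two simple cores" if s > 1.0 else "one sophisticated core (or a tie)"
    print(f"parallel fraction {f:.2f}: speedup {s:.3f} -> {winner}")
```

Under these assumptions the break-even point is a parallel fraction of one half: a program that is less than half parallelizable runs faster on the single sophisticated core, which matches the intuition about inherently sequential computation.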