Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2017

williamx

When there are fewer processors, can we redesign the code so that it makes better use of the cache? If so, this might generate a more accurate speedup plot.

rohany

Can superlinear speedup also come from not accounting for things like SIMD or hyperthreading, which multiply the total possible speedup?

muchanon

In this graph we get speedup beyond the ideal, but why do we not consider the ideal to already include these benefits? I don't know the answer to @rohany's question, but if hyperthreading benefits are possible, shouldn't the ideal already include them?

googlebleh

@rohany It depends on what you define the x-axis to be. In this case, it's number of processors, so technically SIMD would produce a superlinear speedup (since it's utilizing more ALUs, but on the same number of processors). Hyperthreading seems like a borderline case when deciding if that constitutes more processors or not.

However, if you're trying to judge how well you've written a parallel algorithm, then that doesn't make for much of an accomplishment. The graph is just a reference point you lay out so you can benchmark your own code, so consider the appropriate factors when determining a baseline :)
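To make the point about the x-axis concrete, here is a minimal sketch that normalizes the same measured speedup by two different resource counts. The core count, SIMD width, and measured speedup are made-up numbers for illustration, not results from the slide.

```python
def efficiency(measured_speedup, resources):
    """Speedup per unit of execution resource (1.0 means linear speedup)."""
    return measured_speedup / resources

# Hypothetical machine and measurement (assumptions, not course data).
cores = 4
simd_width = 8            # assumed SIMD lanes per core
measured_speedup = 24.0   # assumed speedup over scalar, single-core code

# Counting only cores, the result looks far better than linear.
per_core = efficiency(measured_speedup, cores)

# Counting every SIMD lane as a resource, it is below linear.
per_alu = efficiency(measured_speedup, cores * simd_width)

print(per_core, per_alu)
```

The same run looks "superlinear" against 4 cores and sublinear against 32 ALUs, so the baseline you pick decides what the plot says.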

lfragago

@muchanon, I think the ideal line should only account for the theoretical speedup in terms of the number of processors. Creating an "ideal" line that accounts for memory accesses is very tricky, because it would change for every architecture with a different cache size (even with the same number of cores).
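A toy model shows why a cache-aware "ideal" depends on the machine. It assumes each processor gets an equal share of the data and that an element costs less once a processor's share fits in its cache. The function names, costs, and capacities are hypothetical, chosen only to illustrate the effect.

```python
def run_time(n_elements, p, cache_capacity, fast_cost=1.0, slow_cost=4.0):
    """Toy model: time for one processor to handle its n/p share.

    Assumption: per-element cost is fast_cost when the share fits in the
    processor's cache and slow_cost otherwise.
    """
    share = n_elements / p
    cost = fast_cost if share <= cache_capacity else slow_cost
    return share * cost

def speedup(n_elements, p, cache_capacity):
    """Speedup of p processors over one processor under the toy model."""
    return run_time(n_elements, 1, cache_capacity) / run_time(n_elements, p, cache_capacity)

# Same problem and core counts, two hypothetical cache capacities.
for capacity in (300, 100):
    print(capacity, [speedup(1000, p, capacity) for p in (1, 2, 4, 8)])
```

With the larger cache the curve jumps above linear at 4 processors, and with the smaller one it stays linear until 16, so the "ideal" line would have to be redrawn for each cache size.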

kayvonf

@googlebleh. I agree with your post in general, but I'd like to clarify that we shouldn't call the speedup superlinear with SIMD. I'd rather we say that linear speedup is when we observe a performance increase that is linear in the execution resources used. In this lecture, we make the simplifying assumption that number of resources is in terms of the number of processors (or processor cores) P.

However, one thing I do not agree with is calling hyperthreading a borderline case: Hyper-Threading definitely does not constitute more execution resources (or more cores). Why is that?

In other words, if you have a 4-core processor with two-way multi-threading, what speedup curve are you likely to expect? Looking at your assignment 1 results might give a hint.

eourcs

While hyperthreading does not constitute more execution resources (you still have the same number of ALUs per chip), it is possible to get superlinear speedup (if we define the number of processors to be the number of physical cores) by maximizing use of the ALUs, since the hardware can run the second thread when ALUs are idle. Whether one should consider two "logical" cores sharing execution resources to be roughly equivalent to two physical cores is hard to say. I'd say probably not.
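As a rough sketch of this argument, assume one thread keeps a core's ALUs busy only a fraction of the time and extra hardware threads can fill the idle slots, up to full utilization. The `smt_speedup` function and the utilization values are hypothetical, and real hardware will not behave this cleanly.

```python
def smt_speedup(physical_cores, single_thread_utilization, threads_per_core=2):
    """Toy model of speedup over one thread on one core.

    Assumption: a single thread uses `single_thread_utilization` of a core's
    ALU time, and additional threads on the same core fill idle ALU slots
    until utilization reaches 1.0.
    """
    if not 0.0 < single_thread_utilization <= 1.0:
        raise ValueError("utilization must be in (0, 1]")
    filled = min(1.0, threads_per_core * single_thread_utilization)
    per_core_gain = filled / single_thread_utilization
    return physical_cores * per_core_gain

# 4 physical cores, 2 hardware threads each, varying single-thread utilization.
for u in (0.5, 0.6, 1.0):
    print(u, smt_speedup(4, u))
```

Under this model a 4-core chip lands somewhere between 4x and 8x: above the line when the x-axis is physical cores, and at or below it when the x-axis is logical cores.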

paracon

Prof. Kayvon mentioned in an earlier review session that Intel Hyperthreading is a commercial name that Intel uses to market their chips' simultaneous multi-threading and interleaved multi-threading capabilities. To support this, there are additional execution contexts and ALUs, so I don't understand why hyperthreading does not constitute more execution resources.

cluo1

I think it would be fairer to somehow restrict the cache size and scale only compute power.