chaominy

What is the maximum possible speedup of a parallelized program? Ideally, we would expect the speedup from parallelization to be linear in the number of processors. In practice, however, few programs achieve near-linear speedup as the processor count grows large. The potential speedup of an algorithm on a parallel computing platform is given by Amdahl's law, which states that the portion of the program that cannot be parallelized, however small, limits the overall speedup available from parallelization. In other words, the speedup of a program is bounded by how much of the program can be parallelized. For example, if 80% of the program can be parallelized, the theoretical maximum speedup is 5x, no matter how many processors are used.
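For reference, the formula behind that 5x (p is the parallelizable fraction, N the number of processors):

    S(N) = 1 / ((1 - p) + p/N)

As N grows, S(N) approaches 1 / (1 - p); for p = 0.8 that limit is 1 / 0.2 = 5.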

Reference: Wikipedia, "Amdahl's law". http://en.wikipedia.org/wiki/Amdahl's_law

lkung

Is a 2x speedup on 10 processors a good result? It depends on what your program does.

The example given in class was a 2x speedup on 6 processors for a computer game. If the game had previously been running at 15 fps, that speedup would be significant: doubling to 30 fps makes the game playable.

bscohen

@lkung "It depends on what your program does"

My initial thought is that it has nothing to do with what the program does, and everything to do with the associated cost of scaling to 10 processors. Does it require significant refactoring of the code? How large are the energy and hardware costs of the extra processors?

Cutting 10 s to 5 s, 1 hr to 30 min, or 2 days to 1 day is always a win. The question is simply: at what cost?

lkung

@bscohen Yes, I think that's a better way to phrase it. "It depends on the cost." Of course, the cost depends on what your program does. For example, it would be worth the high cost of refactoring to make a game run faster, and therefore playable. However, the same refactoring cost may not be worth it for another program that doesn't have the same constraints.

unihorn

The efficiency of communication between processors should be considered at both the software level and the hardware level. In general, there has been a lot of work on systems support to improve it, such as scheduling with processor affinity. But newer technologies like MapReduce seem to rely mostly on software to accomplish it. What has the hardware supplied, or what can it supply, for synchronization/communication? And how much can it provide compared to the software?
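Not an answer, but to make both levels concrete, here is a minimal Linux/pthreads sketch (the core numbers and loop count are arbitrary choices). The pthread_setaffinity_np call is the software-visible side of processor affinity, a request to the OS scheduler; the C11 atomic increment shows one thing the hardware does supply for synchronization, since it compiles down to an atomic read-modify-write instruction (e.g., lock xadd on x86).

```c
#define _GNU_SOURCE          /* for pthread_setaffinity_np, CPU_SET */
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

/* Hardware-supplied synchronization: each increment of this counter
 * is a single atomic read-modify-write instruction, not a lock. */
static atomic_long counter = 0;

static void *worker(void *arg)
{
    int cpu = (int)(long)arg;

    /* Software-requested placement: ask the OS to keep this thread
     * on one core so its cached working set stays warm. */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    int err = pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
    if (err != 0)   /* e.g., the machine has fewer cores than asked for */
        fprintf(stderr, "affinity failed: %s\n", strerror(err));

    for (int i = 0; i < 1000000; i++)
        atomic_fetch_add(&counter, 1);
    return NULL;
}

int main(void)
{
    pthread_t t[2];
    for (long i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, (void *)i);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);

    /* Always prints 2000000: the hardware atomics do the coordination. */
    printf("counter = %ld\n", (long)atomic_load(&counter));
    return 0;
}
```

Build with gcc -pthread. Whether pinning actually helps depends on the program's memory access pattern; the point is just that the OS exposes placement as a software knob, while atomicity comes from the hardware.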