Slide View : 15-418/618 Spring 2014

Previous | Next --- Slide 5 of 39

jmnash

On the one hand, if we measure the speedup against a parallel version running on one processor, we can measure the scalability of our algorithm as we increase the input size and the number of processors. Doing it this way would allow us to make judgments about how our algorithm itself scales. However, if we want to measure how useful our algorithm is compared to running sequential code, or how "worth it" it is to parallelize the code, we should measure speedup against the best sequential program, because in the real world the choice is not between a parallel program on one processor and a parallel program on multiple processors. The choice is between a parallel program on many processors and an optimized sequential program on one processor.

This comment was marked helpful 0 times.

jon

I would agree that the "best sequential program" should be the real measuring stick. Often, the parallel version of a program running on one processor has a lot of overhead, such as creating elaborate data structures (such as the data parallel particle simulation from lecture 8. The parallel version is usually optimized to work well only when given many workers.

So, it's pretty unfair to compare the "best" parallel program to a possibly terrible sequential program (though it probably makes for nicer graphs.)

This comment was marked helpful 0 times.

aew

As @jon and @jmnash said, I also think speedup should be measured against the best sequential program. If we want to see the speedup obtained by parallelizing the code, rather than the speedup obtained by adding more processors, we should compare against the sequential version. There can be a lot of overhead involved in the parallel version, even if we're only using one processor. Also I think this approach is better because it is unlikely we will run the parallel version on one processor. However, if we want to measure how our algorithm scales when we add more processors, comparing it against the parallel algorithm on one core seems reasonable.

This comment was marked helpful 0 times.