Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2017

Workload-Driven Performance Evaluation

Previous | Next --- Slide 24 of 43

Back to Lecture Thumbnails

Cake

Can someone explain how the comm-to-comp ratio here is fixed at 1/N?

bochet

@Cake

In memory constrained case, each processor still processes a block. The block is N by N, thus N^2 elements.

The computation cost is N^2 (elements in block); The edge of each 2d block needs to be communicated, so the communication cost is O(N). So the comm-to-comp is fixed at N / N^2 = 1/N

bochet

For the implication part, TC and PC's comm-to-comp increases with more processors, so more computation overheads. MC's comm-to-comp is constant, so not affected by the number of processors.

unparalleled

In the problem constrained scaling, the communication to computation ratio increase by a factor of sqrt(P) with the increase in number of processors. But in time constrained scaling since we also scaling up the problem size, the communication to computation ratio increases much slowly as seen above.

Implication: Linear scaling(ideal) is more probable in time constrained scaling than in problem constrained scaling.

machine6

To summarize PC, TC and MC:

PC: Fix the problem size, increase the number of processors.

TC: Fix the execution time, increase the problem size.

MC: Fix elements computed per processor, increase the problem size, increase processors which increases memory capacity.

machine6

What I don't understand is how can there be an increase in execution time if the number of elements being processed is fixed? Can someone clarify?

EDIT: Is it because work does not increase linear to memory usage?

pk267

It can also be seen intuitively that TC is easier than PC, because for a given increase in number of processors, the increase in work done is cubic. Thus, lesser number of processors could achieve the same speedup. And so it is easier.

Please correct me if I am wrong!