Slide View : 15-418/618 Spring 2014

Workload-Driven Performance Evaluation

Previous | Next --- Slide 21 of 39

Back to Lecture Thumbnails

taegyunk

I am a little bit confused about above Time-constraint scaling example.

Why scaling the grid size to be k * k would result in speedup k^3/p?

This comment was marked helpful 0 times.

kayvonf

Time-constrained scaling means that as more processors are used the problem size is also increased so that the overall runtime stays the same. We know the problem takes $O(N^3)$ time on one processor, so when working on a larger grid with P processors, we need to select a grid size $k$ such that $k^3/p = n^3$. (This assumes that the P-processor version of the code perfectly uses the P processors... e.g., gets perfect speedup)

This comment was marked helpful 0 times.

pingshaz

I believe that the Comm-to-comp ratio should be computed not as (elements computed per process) / (elements communicated per process), because elements computed per process is not the same as (less than) the work done in each process. In this example, in general if we have a potentially scaled up grid of k by k elements, every processor works with b = O(k^2 / p) elements and takes t = O(k) while communicating c = O(k / sqrt(p)) elements. CtCR should be c / (b * t) = O(sqrt(p) / (k^2)). In the case of Problem-constrained scaling, we have k = n, which gives us CtCR = O(sqrt(p) / (n^2)). In the case of Time-constrained scaling, we have k = n * p^(1/3), which gives us CtCR = O(1 / (n^2 * p^(1 / 6))). In the Memory-constrained scaling case, we have k = O(n * sqrt(p)), and CtCR = O(1 / (n^2 * p^(1 / 2))). If we treat n as a constant, we have O(sqrt(p)), O(p^(-1 / 6)) and O(p^(-1 / 2)) as the respective CtCR's for this example.

This comment was marked helpful 0 times.