By the definition of MC scaling, we want to run the largest problem which doesn't cause memory overflow problem. Suppose originally we process N^2 element on one processor, then with P processors, we should processor (N^2)P elements in total. With this, we can derive the grid size as NP^(1/2) * NP^(1/2). The rest of the calculate/ratio can be done by the formula taught in previous class.

Khryl

Time-constrained scaling actually allows $K$ to be increased by $P$, $K = NP^{1/3}$, from last slides we know that Comm-to-comp ratio is $K / P^{1/2}$, thus now it becomes $N / P^{1/6}$.

Intuitively, larger grid means more elements inside a processor's block, and we know area-to-perimeter rate increases when area increases.

thomasts

I'm confused by this. Is this saying that execution time increases as you add more processors?

hofstee

@thomasts Yes, but remember we're doing $N^2P$ sized grids now instead of just $N^2$.

418_touhenying

How does comm-to-comp ratio relate to the implication?

BensonQiu

@418_touhenying: As you scale the number of processors, communication overhead between processors increase.

For problem constrained scaling, the size of the working set amongst each processor decreases as you increase the number of processors. So each processor is doing very little computation on their small working set. The increased communication overhead becomes a more dominant constraint on the performance of the grid solver. Therefore, the comm-to-comp ratio for PC implies that PC scaling is the worst.

For memory constrained scaling, grid size is scaled such that the size of the working set amongst each processor stays the same even as you increase the number of processors. So each processor still has a working set of N^2 elements, and the comm-to-comp ratio for MC implies that MC scaling is the best out of MC, TC and PC.

bojianh

Is there a mistake on the slides for the MC scaling? I think the execution time should be O(N^3 P ^(1/2)) instead.

misaka-10032

@bojiah Take N as a constant here.

yikesaiting

Well, for memory constraint, I think the comm-comp ratio should be 2/N - 4/N or O(1/N)

Memory-constrained scaling:

By the definition of MC scaling, we want to run the largest problem which doesn't cause memory overflow problem. Suppose originally we process N^2 element on one processor, then with P processors, we should processor (N^2)P elements in total. With this, we can derive the grid size as NP^(1/2) * NP^(1/2). The rest of the calculate/ratio can be done by the formula taught in previous class.

Time-constrained scaling actually allows $K$ to be increased by $P$, $K = NP^{1/3}$, from last slides we know that Comm-to-comp ratio is $K / P^{1/2}$, thus now it becomes $N / P^{1/6}$.

Intuitively, larger grid means more elements inside a processor's block, and we know area-to-perimeter rate increases when area increases.

I'm confused by this. Is this saying that execution time increases as you add more processors?

@thomasts Yes, but remember we're doing $N^2P$ sized grids now instead of just $N^2$.

How does comm-to-comp ratio relate to the implication?

@418_touhenying: As you scale the number of processors, communication overhead between processors increase.

For problem constrained scaling, the size of the working set amongst each processor decreases as you increase the number of processors. So each processor is doing very little computation on their small working set. The increased communication overhead becomes a more dominant constraint on the performance of the grid solver. Therefore, the comm-to-comp ratio for PC implies that PC scaling is the worst.

For memory constrained scaling, grid size is scaled such that the size of the working set amongst each processor stays the same even as you increase the number of processors. So each processor still has a working set of N^2 elements, and the comm-to-comp ratio for MC implies that MC scaling is the best out of MC, TC and PC.

Is there a mistake on the slides for the MC scaling? I think the execution time should be O(N^3 P ^(1/2)) instead.

@bojiah Take N as a constant here.

Well, for memory constraint, I think the comm-comp ratio should be 2/N - 4/N or O(1/N)