Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2017

Previous | Next --- Slide 25 of 64

axiao

This algorithm works as follows:

Suppose for simplicity we have 2^k elements. Then we iterate from i = 0 to i = k - 1, and on each iteration for each index j set A[j] += A[j - 2^i] (as long as not out of bounds).

This means we maintain the invariant that at the end of iteration i the element at position j is equal to sum(j - 2^(i+1) + 1, j), the sum of all elements from index j - 2^(i+1) + 1 to index j (cap the lower bound at 0).

To see why this is inductively true: at start of iteration i, A[j] = sum(j - 2^i + 1, j) (right after end of iteration i - 1). Now we set A[j] as follows:

A[j] = sum(j - 2^i + 1, j) + A[j - 2^i]

= sum(j - 2^i + 1, j) + sum(j - 2^i - 2^i, j - 2^i)

= sum(j - 2^i + 1, j) + sum(j - 2^(i+1), j - 2^i)

= sum(j - 2^(i+1), j)

As desired by end of iteration i. This holds true for any operation, not just sum.

Cake

Shouldn't this only hold for associative operations? I might be forgetting my 210 here..

eourcs

Yes, this scan implementation would only be equivalent to a sequential scan implementation for associative operations. In a sequential implementation, A_0-2 = (((A_0) + A_1) + A_2), but in this implementation, A_0-2 = (A_0 + (A_1 + A_2)). This is kind of interesting to note since floating point addition is not associative, so this implementation would not be correct (well, maybe close enough) for floating point sum!