Previous | Next --- Slide 29 of 64
Back to Lecture Thumbnails
chenh1

What does "Add base to elemnt a8 thru a8-11" and "Add base to elements a8-12 thru a8-15" means? From the previous slide, the only exchange of data I find is 0 and a0-7.

williamx

I don't think this is the same algorithm as the previous slide. Since we only have 2 processors, we don't need to use the parallel scan algorithm.

Instead, this algorithm works as follows: In the first iteration, the first thread does scan on elements 0 to 7 completely sequentially, while the second thread does scan on elements 8 to 15 completely sequentially. Now, the output is almost correct, except that we need to add the sum of the first 8 elements to indices 8 through 15. So in the second iteration, the first thread adds the sum to indices 8 to 11, while the second thread adds the sum to indices 12 to 15.

crabcake

Why is constant 1.5?

pdp

@crabcake: We have O(N) work to do two sequential sums and we take the sum of first half of the array to add to every element of second half of the array. Hence O(N/2) work to parse the second half of the array.

The main point being, in the previous slide, we do 2 times O(N) work and span is 2O(lg N). Here, we do 1.5O(N) work.