Step 3 is finding the start and end addresses of each cell segment. This step is highly parallelizable because no two threads will attempt to modify the same piece of memory.
Consider index i. The only write to cell_starts[i] happens when result[i] is the first element of that cell block. Since result is sorted, this will occur exactly once. The only write to cell_ends[i] happens when you start a new cell block. Once again, since result is sorted, this will happen exactly once. Therefore, no two threads touch the same index in the result array.
In this particular example, only index=2 will modify cell_starts[6] and only index=5 will modify cell_ends[6].
After Step 3 has run, you can use cell_starts and cell_ends to index into the array of particles which I assume has been sorted with the result array.
This comment was marked helpful 1 times.
kayvonf
@AnnabelleBlue -- that was an excellent explanation of the algorithm and its invariants. Very well done. (We prettied up the code references in monospace using Markdown to make it easier to read.)
Step 3 is finding the start and end addresses of each cell segment. This step is highly parallelizable because no two threads will attempt to modify the same piece of memory.
Consider index
i
. The only write tocell_starts[i]
happens whenresult[i]
is the first element of that cell block. Sinceresult
is sorted, this will occur exactly once. The only write tocell_ends[i]
happens when you start a new cell block. Once again, since result is sorted, this will happen exactly once. Therefore, no two threads touch the same index in the result array.In this particular example, only
index
=2 will modifycell_starts[6]
and onlyindex
=5 will modifycell_ends[6]
.After Step 3 has run, you can use
cell_starts
andcell_ends
to index into the array of particles which I assume has been sorted with the result array.This comment was marked helpful 1 times.
@AnnabelleBlue -- that was an excellent explanation of the algorithm and its invariants. Very well done. (We prettied up the code references in monospace using Markdown to make it easier to read.)
This comment was marked helpful 0 times.