Although the number of barriers can be changed, here is the rationale for each barrier in this program:
Barrier 1: Needed; otherwise one thread could add to diff in the line diff += myDiff, and a different thread could reset the global diff to zero in the line float myDiff = diff = 0.f immediately afterwards, discarding that contribution.
Barrier 2: Needed; otherwise a thread may decide that it is done based on an artificially low diff that is missing the contributions from other threads.
Barrier 3: Needed; every thread must evaluate the line if (diff / (n * n) < TOLERANCE), and so decide whether done is true or false, before any thread has the chance to set the global diff back to zero at the beginning of the while loop.
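To make the three barrier positions concrete, here is a minimal sketch of the loop using POSIX threads. It is not the lecture code: NUM_THREADS, N, the TOLERANCE value, the iterations counter, and run_solver are hypothetical names and values, and a simple halving update stands in for the real grid update. The sketch also takes the lock around the shared writes to diff and done, which is an assumption added to keep the C free of data races.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NUM_THREADS 4      /* hypothetical thread count */
#define N 8                /* hypothetical grid side: N * N cells */
#define TOLERANCE 1e-3f    /* hypothetical convergence threshold */

static float A[N * N];
static float diff;
static bool done;
static int iterations[NUM_THREADS];
static pthread_mutex_t diff_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_barrier_t loop_barrier;

static void *solve(void *arg) {
    int id = *(int *)arg;
    int chunk = (N * N) / NUM_THREADS;
    int lo = id * chunk, hi = lo + chunk;

    while (!done) {
        /* Corresponds to "float myDiff = diff = 0.f"; the shared write is
           split out so it can be done under the lock (sketch assumption). */
        float myDiff = 0.f;
        pthread_mutex_lock(&diff_lock);
        diff = 0.f;
        pthread_mutex_unlock(&diff_lock);
        pthread_barrier_wait(&loop_barrier);   /* barrier 1: all resets finish before any add */

        for (int i = lo; i < hi; i++) {        /* stand-in for the real cell update */
            float prev = A[i];
            A[i] = 0.5f * prev;
            myDiff += prev - A[i];
        }
        pthread_mutex_lock(&diff_lock);
        diff += myDiff;
        pthread_mutex_unlock(&diff_lock);
        pthread_barrier_wait(&loop_barrier);   /* barrier 2: diff now holds every contribution */

        if (diff / (N * N) < TOLERANCE) {
            pthread_mutex_lock(&diff_lock);
            done = true;
            pthread_mutex_unlock(&diff_lock);
        }
        pthread_barrier_wait(&loop_barrier);   /* barrier 3: everyone has tested diff before a reset */
        iterations[id]++;
    }
    return NULL;
}

/* Runs the solver; returns the per-thread iteration count, or -1 if threads disagree. */
int run_solver(void) {
    pthread_t threads[NUM_THREADS];
    int ids[NUM_THREADS];

    for (int i = 0; i < N * N; i++) A[i] = 1.f;
    diff = 0.f;
    done = false;
    int rc = pthread_barrier_init(&loop_barrier, NULL, NUM_THREADS);
    assert(rc == 0);
    for (int t = 0; t < NUM_THREADS; t++) { ids[t] = t; iterations[t] = 0; }
    for (int t = 0; t < NUM_THREADS; t++) {
        rc = pthread_create(&threads[t], NULL, solve, &ids[t]);
        assert(rc == 0);
    }
    for (int t = 0; t < NUM_THREADS; t++) pthread_join(threads[t], NULL);
    pthread_barrier_destroy(&loop_barrier);
    (void)rc;

    for (int t = 1; t < NUM_THREADS; t++)
        if (iterations[t] != iterations[0]) return -1;
    return iterations[0];
}
```

Calling run_solver() from a main function (compiled with -pthread) runs every thread through the same number of iterations; removing any one of the three waits reintroduces the corresponding race described above.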
This comment was marked helpful 5 times.
spilledmilk
I believe the code can be slightly rewritten so that only the first two barriers are needed. The code after the for loop should look like:
This essentially sets done := true by default and then uses the if statement to check whether it should be set to false. The difference from the given code is that after the second barrier, if diff/(n*n) < TOLERANCE, each thread skips the new if statement and exits the while loop. However, if diff/(n*n) >= TOLERANCE, the first thread to reach the if statement sets done := false. Because the only place done is set to true is between the two barriers in the while loop, all threads will continue to the next iteration even if diff is reset to 0 before they evaluate the if statement, which removes the need for the last barrier in the original code.
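A minimal sketch of the two-barrier variant, reconstructed from the description above rather than taken from the original code. NUM_THREADS, N, the TOLERANCE value, the iterations counter, and run_two_barrier_solver are hypothetical, and a halving update stands in for the real grid update. Because a fast thread may now reset diff or clear done while a slow thread is still reading them, this sketch reads and writes both under a mutex; that locking is an assumption of the sketch, not part of the comment's proposal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define NUM_THREADS 4      /* hypothetical thread count */
#define N 8                /* hypothetical grid side: N * N cells */
#define TOLERANCE 1e-3f    /* hypothetical convergence threshold */

static float A[N * N];
static float diff;
static bool done;
static int iterations[NUM_THREADS];
static pthread_mutex_t diff_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_barrier_t loop_barrier;

static void *solve_two_barriers(void *arg) {
    int id = *(int *)arg;
    int chunk = (N * N) / NUM_THREADS;
    int lo = id * chunk, hi = lo + chunk;
    bool finished = false;   /* private copy of done, read under the lock */

    while (!finished) {
        float myDiff = 0.f;
        pthread_mutex_lock(&diff_lock);
        diff = 0.f;
        pthread_mutex_unlock(&diff_lock);
        pthread_barrier_wait(&loop_barrier);   /* barrier 1 */

        for (int i = lo; i < hi; i++) {        /* stand-in for the real cell update */
            float prev = A[i];
            A[i] = 0.5f * prev;
            myDiff += prev - A[i];
        }
        pthread_mutex_lock(&diff_lock);
        diff += myDiff;
        done = true;                           /* default: assume convergence */
        pthread_mutex_unlock(&diff_lock);
        pthread_barrier_wait(&loop_barrier);   /* barrier 2 */

        /* No third barrier. A faster thread may already have reset diff to 0,
           but only after it set done = false, so finished is still false here. */
        pthread_mutex_lock(&diff_lock);
        if (diff / (N * N) >= TOLERANCE)
            done = false;
        finished = done;
        pthread_mutex_unlock(&diff_lock);
        iterations[id]++;
    }
    return NULL;
}

/* Runs the solver; returns the per-thread iteration count, or -1 if threads disagree. */
int run_two_barrier_solver(void) {
    pthread_t threads[NUM_THREADS];
    int ids[NUM_THREADS];

    for (int i = 0; i < N * N; i++) A[i] = 1.f;
    diff = 0.f;
    done = false;
    int rc = pthread_barrier_init(&loop_barrier, NULL, NUM_THREADS);
    assert(rc == 0);
    for (int t = 0; t < NUM_THREADS; t++) { ids[t] = t; iterations[t] = 0; }
    for (int t = 0; t < NUM_THREADS; t++) {
        rc = pthread_create(&threads[t], NULL, solve_two_barriers, &ids[t]);
        assert(rc == 0);
    }
    for (int t = 0; t < NUM_THREADS; t++) pthread_join(threads[t], NULL);
    pthread_barrier_destroy(&loop_barrier);
    (void)rc;

    for (int t = 1; t < NUM_THREADS; t++)
        if (iterations[t] != iterations[0]) return -1;
    return iterations[0];
}
```

The key ordering is that a thread clears done before it can loop around and reset diff, and no thread can set done back to true until every thread has passed barrier 1 of the next iteration.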
This comment was marked helpful 0 times.
bxb
Thinking about spilledmilk's answer, his method is valid assuming that done, stored in global memory somewhere, is actually read from memory by every thread on every access. If done were instead held in a register, I'm not sure it would necessarily work every time. Aside from C's volatile keyword, what other mechanisms are there for dealing with thread memory accesses/caching? (I'm not sure I'm wording my concern properly, but hopefully it makes sense.)
This comment was marked helpful 0 times.
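On the question of mechanisms other than volatile: one option, assuming a C11 toolchain, is the atomics in <stdatomic.h>; under POSIX, mutex operations and pthread_barrier_wait also synchronize memory between threads. The sketch below shows a flag published with release/acquire ordering so that a reader that sees the flag also sees the data written before it. The names payload, ready, and run_flag_demo are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

static int payload;          /* ordinary, non-atomic data */
static atomic_bool ready;    /* flag that publishes the payload */

static void *producer(void *arg) {
    (void)arg;
    payload = 42;
    /* Release store: the write to payload is visible to any thread
       whose acquire load observes ready == true. */
    atomic_store_explicit(&ready, true, memory_order_release);
    return NULL;
}

static void *consumer(void *arg) {
    int *out = arg;
    /* Acquire load: performs a real atomic read on every pass of the loop. */
    while (!atomic_load_explicit(&ready, memory_order_acquire))
        ;  /* spin until the flag is published */
    *out = payload;
    return NULL;
}

/* Runs one producer and one consumer; returns the value the consumer saw. */
int run_flag_demo(void) {
    pthread_t p, c;
    int seen = 0;

    payload = 0;
    atomic_store(&ready, false);
    int rc = pthread_create(&c, NULL, consumer, &seen);
    assert(rc == 0);
    rc = pthread_create(&p, NULL, producer, NULL);
    assert(rc == 0);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    (void)rc;
    return seen;
}
```

Here run_flag_demo() returns 42, the value written before the flag was set. By contrast, volatile in C only constrains how the compiler treats accesses to that one object; it does not order other memory accesses between threads.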