Slide 23 of 30
smklein

The problem with this implementation of the barrier is that although it ensures all threads WAIT until everyone has arrived, it does not ensure that all threads LEAVE before the barrier is reused.

If ANY threads are still stuck in the while (b->flag == 0) loop (keep in mind, b->flag will actually be 1 at that point, they just have not had a chance to re-evaluate it yet), and ANOTHER barrier is called, the flag will be reset to zero.

Some threads will be stuck on the first barrier, and some threads will be stuck on the second barrier. All threads will be stuck.
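
For reference, here is a rough sketch of the kind of counter-plus-flag barrier being discussed (reconstructed from the discussion, not copied from the slide; the names barrier_t and barrier_wait are mine). The race is exactly where the comments point: a thread can still be spinning in the final while loop when the first arriver at the next barrier resets flag to 0. Plain volatile spinning is used to match the slide-style pseudocode; production code would need atomics or fences.

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    volatile int counter;   // number of threads that have arrived; init 0
    volatile int flag;      // 0 = not released yet, 1 = released; init 0
} barrier_t;

void barrier_wait(barrier_t* b, int p) {    // p = total number of threads
    pthread_mutex_lock(&b->lock);
    if (b->counter == 0)
        b->flag = 0;                        // first arriver clears the flag
    int arrived = ++(b->counter);
    if (arrived == p) {                     // last arriver releases everyone
        pthread_mutex_unlock(&b->lock);
        b->counter = 0;
        b->flag = 1;
    } else {
        pthread_mutex_unlock(&b->lock);
        while (b->flag == 0)                // BUG: a slow thread can still be
            ;                               // spinning here when the first
    }                                       // arriver at the NEXT barrier
}                                           // resets flag back to 0
```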

squidrice

An alternative solution is to use a different barrier variable. If a thread hits the second barrier, it updates an independent flag, so there is no deadlock. Note that the solution on slide 25 uses two flags conceptually by flipping the value of flag.
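
For concreteness, here is a sketch of that sense-reversing idea (my reconstruction of the slide 25 approach, not the slide's exact code; local_sense is assumed to be a per-thread variable, passed in here by pointer):

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    int counter;            // number of arrivals; init 0
    volatile int flag;      // sense of the last completed barrier; init 0
} sense_barrier_t;

// local_sense must be private to each thread (thread-local or on its stack),
// initialized to 0.
void sense_barrier_wait(sense_barrier_t* b, int p, int* local_sense) {
    *local_sense = 1 - *local_sense;        // flip the sense for this instance
    pthread_mutex_lock(&b->lock);
    int arrived = ++(b->counter);
    if (arrived == p) {                     // last arriver
        b->counter = 0;                     // reset for the next instance
        pthread_mutex_unlock(&b->lock);
        b->flag = *local_sense;             // release: flag now matches sense
    } else {
        pthread_mutex_unlock(&b->lock);
        while (b->flag != *local_sense)     // a straggler cannot miss this:
            ;                               // the next instance waits for the
    }                                       // OPPOSITE sense, and cannot
}                                           // complete until this thread arrives
```

Because consecutive barrier instances wait for opposite flag values, the flag is never reset to a value that a straggler from the previous instance is still waiting to see.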

retterermoore

The exact interleaving of processor steps that would lead to the deadlock would be:

1. Processor 0 reaches barrier 1, sets b->flag to 0, and begins waiting.
2. Processor 1 reaches barrier 1 and sets b->flag to 1.
3. Processor 1 executes the code in between the barriers.
4. Processor 1 reaches barrier 2, sets b->flag to 0, and begins waiting.

Now both processors are waiting on b->flag, but they're waiting at different barriers, so neither barrier will ever reach the threshold of 2 processors needed to set b->flag to 1.

A possible fix would be to somehow force the while condition to be checked every so often; alternatively, there would be no problem if the work between the barriers is expensive/lengthy enough that processor 0 is guaranteed a chance to check the flag before processor 1 finishes. But neither of those makes for a valid general-purpose or black-box barrier implementation.

aaguilet

I was thinking that another way to solve the problem would be: instead of making the last process set b->counter = 0, just have it set flag = 1; then each process decrements counter by one right after leaving the barrier. Now, the next time we enter the barrier and set flag = 0 again, we are guaranteed that all the processes have left the previous barrier.

One drawback of this approach is that we would again need a lock to update the counter, or an atomic add to avoid the lock. But even with an atomic add, we would still generate interconnect traffic by invalidating the cache line containing the counter.
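
To make this concrete, here is one way the idea could look (my interpretation, not code from the course). One detail the description leaves implicit: a thread arriving at the next barrier has to wait for the counter to drain back to zero before it resets the flag and joins in; otherwise leavers and new arrivals get mixed into the same count. The atomic-add variant would replace the lock around the counter updates.

```c
#include <pthread.h>
#include <sched.h>

typedef struct {
    pthread_mutex_t lock;
    volatile int counter;   // arrivals minus departures for the current instance
    volatile int flag;      // 0 while gathering, 1 while releasing; init 0
} leave_barrier_t;

void leave_barrier_wait(leave_barrier_t* b, int p) {
    pthread_mutex_lock(&b->lock);
    // If the previous instance is still releasing threads, wait for its
    // counter to drain to zero before resetting the flag for this one.
    while (b->flag == 1) {
        if (b->counter == 0) {
            b->flag = 0;                    // everyone has left; safe to reuse
        } else {
            pthread_mutex_unlock(&b->lock);
            sched_yield();                  // let stragglers finish leaving
            pthread_mutex_lock(&b->lock);
        }
    }
    int arrived = ++(b->counter);
    pthread_mutex_unlock(&b->lock);

    if (arrived == p) {
        b->flag = 1;                        // last arriver releases everyone,
    } else {                                // but does NOT reset the counter
        while (b->flag == 0)
            ;                               // wait to be released
    }

    pthread_mutex_lock(&b->lock);
    b->counter--;                           // each thread "checks out" on exit;
    pthread_mutex_unlock(&b->lock);         // flag can only be reset once this
}                                           // count is back to zero
```

With p = 2 and the interleaving retterermoore describes, processor 1 would now spin at the top of barrier 2 until processor 0 has decremented the counter out of barrier 1, so the flag processor 0 is waiting on is never reset prematurely.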

sluck

Just as a refresher: for the line int arrived = ++(b->counter), it is important that the increment operator ++ comes before (b->counter), because that ensures arrived is set to the value of b->counter after the increment. If instead we had int arrived = (b->counter)++, arrived would be set to the value of b->counter before the increment, so the last arriver would read p - 1 instead of p, never set the flag, and all processors would wait forever for everyone to arrive at the barrier, even though each processor has already arrived.
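
A tiny standalone example of the difference (nothing barrier-specific here):

```c
#include <stdio.h>

int main(void) {
    int counter = 0;
    int arrived_pre = ++counter;    // pre-increment:  counter == 1, arrived_pre == 1

    counter = 0;
    int arrived_post = counter++;   // post-increment: counter == 1, arrived_post == 0

    printf("%d %d\n", arrived_pre, arrived_post);   // prints: 1 0
    return 0;
}
```

With the post-increment version, the thread that brings b->counter up to p reads arrived == p - 1, so the arrived == p release branch never executes and the flag is never set.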