Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2015

top

This would not work. From the example we can see that the second barrier would not work as the flag has already been set by the first barrier and was not reset.

arjunh

@top It's a little more subtle than that. Note that the first thread that reaches the second barrier will also clear the flag (the counter is reset by the last thread that reaches the first barrier).

We'll talk more about barriers in the next lecture, but it would be a good exercise for someone to provide a clear, convincing trace to explain why this barrier implementation could fail.

scedarbaum

Consider two threads, T0 and T1 (so P = 2). T0 enters the first barrier, increments the counter (num_arrived = 1, and spins in the while loop waiting for T1. T0 is then descheduled. T1 now enters the first barrier, increments the counter (so now num_arrived == p), resets the barrier's counter and flag, enters the second barrier, and, finally, sets the flag back to 0. This all occurs before T0 had a chance to check its loop conditional, so it missed the window where flag == 1. Now, T0 is stuck in the first barrier and T1 is stuck in the second barrier, both waiting for the other to finish.

Berry

@scedarbaum Reiterating on your answer threads only check if everyone has made it into the barrier but not that they have made it out - at which point the flag has a value that a straggler thread will misinterpret. The next slide introduces another counter to ensure everyone leaves the barrier before progressing. The slide after that relies on flipping the value the flag is compared to when testing if everyone has made it into the next barrier to avoid the issue seen here as well as the two counters from the next slide.

jiajunbl

There is also a possible trace where a thread returns before all processors have hit the barrier in the 2nd iteration.

Consider T1 and T2 call barrier, and T1 is now after if (num_arrived==p), and T2 returns. T2 then calls barrier again, and is waiting on the while loop. T1 then resumes execution and sets flag = 1, and bad things happen.

afzhang

The flag field should be padded to sit on its own cache line since each processor is going to be reading its value in a loop (the while (b->flag == 0); line), so doing so would reduce thrashing.

yuel1

@afzhang to add to your point, it would reduce thrashing because for the majority of the time, flag can be in the shared state, but if it sits on the same cache line as counter, then it will be in read exclusive state (due to the fact that we increment counter frequently) most of the time, making it hard for the reads to happen with low latency.

abist

@yuel1 adding to your comment, this is the something similar to the optimization we made in test and test and set method from the test and set method for locks.

Also. another question - why do we need the flag?

gryffolyon

@abist, flag is required to block the threads in the barrier and also release them from the barrier once the condition is met. We can also use conditional variables to implement barriers. In case you are interested in conditional variables, look at the 15410 slides on it here:

Conditional Variables

vrazdan

So isn't the error present here due to the fact that the code is not reentrant? Because there are variables in the function that are not independent from calls (a necessity almost because of the nature of locks), the trouble arises when different calls modify variables that other instances of the function rely on. I guess most faulty treaded code is due to reentrant errors.