Previous | Next --- Slide 24 of 29
Back to Lecture Thumbnails

This works since it introduces the concept of "odd/even" barriers, and since two consecutive barriers cannot be even (local_sense == 0) and odd (local_sense == 1), we won't have the issue from the previous slide.


This wouldn't work if it were somehow possible for threads to be in three barriers at once. But that is impossible because it takes every thread to exit a barrier.


I was thinking of a slightly different way to implement this. Instead of requiring a per-processor variable, my thought was to add a third state to the flag. It effectively does the same thing (assuming I didn't make a mistake), but I thought I would share since it was my first thought for how to fix the original buggy version.

// barrier for p processors
void Barrier(Bar_t* b, int p) {
  while (b->flag == 2);  // wait until everyone leaves the last barrier
  if (b->counter == 0) {
    b->flag = 0;  // first arriver clears flag
  int arrived = ++(b->counter);
  if (arrived == p) {  // last arriver sets flag
    b->flag = 2;
  else {
    while (b->flag == 0);  // wait for flag
    int remaining = --(b->counter);
    if (remaining == 0)
      b->flag = 1;

I also thought of a different way to implement this, but my goal was to remove the locks. I think I succeeded in removing them with calls to atomicCAS

void Barrier(Bar_t *b, int p) {
  local_sense = (local_sense == 0) ? 1 : 0;

  bool changed = false;
  int guess = b->counter;

  // Keep trying to change the counter until success
  while (!changed) {
    int actual = atomicCAS(&b->counter, guess, guess + 1);

    if (guess == actual) {
      changed = true;
    else {
      guess = actual;

  if (guess + 1 == p) {
    b->counter = 0;
    b->flag = local_sense;
  else {
    while (b->flag != local_sense); // wait for flag

What happens when I call
- sets local_sense of p processors to 1,
- sets flag to 1,
and right away
- sets local_sense of p processors to 0,
- sets flag to 0,
Is there a chance that some processors in first barrier call have not yet stopped spinning and the second barrier finishes using only 1 processor, setting flag to 0 therefore deadlock for processors from first barrier? Or am I missing something?


@raphaelk: I think barriers are always for a "group of processes". The usage you suggest might not be a correct use of an instance of a barrier. This can lead to deadlock
As per my guess, if you do want a second barrier only for a subset of processes (different "p"), then you should define a different barrier variable "b2" and call barrier(b2,1).
Example code: (Assume 8 processes)

//Common code
barrier(b,8) //All processes should reach this barrier
if(pid % 2 == 0) {
    //Code specific to even pids
    barrier(b_even,4) //Barrier for even pids
    //more code specific to even pids
} else {
    //Code specific to odd pids
    barrier(b_odd,4) //Barrier for odd pids
    //more code specific to odd pids