Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2016

kipper

Notice the difference between having a lock around the entire stack and this lock-free version: Before, if a thread holds a lock for some part of the stack and if that thread was ripped off the CPU, then no other thread can make progress. But with the lock free stack, a thread "being" in the data structure doesn't mean it has exclusive access. Instead, the atomicity of a sequence of operations is ensured by the CAS check. As we see on the next slide, this implementation is subject to the ABA problem.

chuangxuean

The CAS is required here and not in the previous example of queues because for a stack, the popping and pushing are both on the top of the stack but for queues, they occur at opposite ends of the data structure.

msfernan

In the push function in the syntax for compare_and_swap(&s->top, old_top, n) if the pointer to node s->top == the pointer to node old_top, then swap in the value of n for old_top.

In the pop function in the syntax for compare_and_swap(&s->top, old_top, new_top), if the pointer to the node s->top == the pointer to the node old_top then swap in the value of new_top for old_top.

I don't understand why you have an &s->top. Why don't you simply pass in s->top ? Is this just an implementation detail?

kayvonf

@msfernan. compare_and_swap() takes as input a memory address (hence the pointer to s->top) the expected value stored at that address, and a new value to store in the address is the compare-and-swap succeeds.

msfernan

Thanks @kayvonf. That makes sense. My friend directed me to an implementation for compare_and_swap online. It helped me understand better.

bool compare_and_swap(Value* addr, Value oldVal, Value newVal){

  if(*addr == oldVal){
    *addr = newVal;
     return true;
  }
  else{
    return false;
  }

}

kayvonf

@msfernan. Correct!

huehue

If compare-and_swap returns a bool, why do we check if its return value is equal to Node* old_top?

xiaoguaz

@huehue Good point. I think we should remove the " == old_top" in this code, too.

leis1

@kayvonf As an abstraction, CAS to me is still sort of lock or blocking, CAS guarantees that it's atomic and that CAS(&addr;, 4, 5) and CAS(&addr;, 4, 6) won't return true at the same time. The lock here for me is the finest-grain level and the spin here includes the modification of a data structure therefore we call it's making "progress" and non-blocking.

Thus I'm a little confused here. One reason we say using CAS is lock-free could be that the lock doesn't happen at c language level. Am I making sense here?

PandaX

@leis1

I think it is lock-free mainly because the thread is still making progress (on some work we created)

Although I didn't see any advantages of using lock-free stack here. Maybe it potentially saves the time of using locks which may involve swapping out threads by OS.

msfernan

@PandaX, I think it is lock-free because at the time at which the CAS happens, the systems as a whole has made progress in some way.

If the CAS failed, then that implies another thread made progress. If the CAS succeeded, then the current thread made progress. Therefore, the system as a whole always make progress.

As to the advantages of using a lock-free stack I too think there may not be that many advantages. I found a paper that says two common lock-free stack algorithms do not perform well except at high thread counts. The paper combines the two algorithms to give a more scalable one. http://people.csail.mit.edu/shanir/publications/Lock_Free.pdf.

The practice exam 2 also has a question in which one has to reason that stacks are not the best data structure to code up as lock-free primarily because all threads are trying to modify the top and so cause contention.

Renegade

From my perspective, what lock-free/CAS achieves here is to make threads never blocking each other. One way to think about this is: if using lock (at any level), thread_0 holds the lock and for some reason gets swapped out by OS, which blocks all other threads that try to operate on the stack indefinitely. In this lock-free implementation, CAS ensures threads can proceed without blocking one another.

RX

I think the essence of lock free data structure is that "there's always someone who is making progress"

RX

What is the problem here if the memory system is not sequential consistent?

blairwaldorf

@RX: If it isn't sequential consistent, then the assignment of those array pointers can be done in any order, and we wouldn't proceed along the linked list as expected.

RX

@blairwaldorf wouldn't Mutex has the same problem, what is special about lock free implementation relies on memory ordering?

jsunseri

@RX yes, using a mutex would have the same problem. I don't think he was implying it was a problem specific to the lockfree implementation by noting it on the slide.