Olorin

I'm not sure I see why atomic_increment would go wrong. Here's a rough implementation that I believe works the "same way" as the atomic_min above:

int atomic_increment(int* addr, int x) {
  int old, new_val;
  do {
    old = *addr;                // snapshot the current value
    new_val = old + x;          // compute the incremented value
    // retry if the CAS sees a value different from the one we read
  } while (atomicCAS(addr, old, new_val) != old);
  return old;                   // value observed before our increment
}

I don't see any issues present in this implementation, assuming a correct atomicCAS and ignoring the issue of starvation which is theoretically present in atomic_min as well. Can someone help me understand?

rofer

I'm also not sure why we can't implement atomic_increment as Olorin has done.

However, while thinking about it I came up with what I believe would be a slightly faster atomic_min:

int atomic_min(int* addr, int x) {
  int old, new_val;
  do {
    old = *addr;
    new_val = min(old, x);
    // skip the CAS entirely if x wouldn't lower the value; otherwise
    // retry until the CAS observes a value no greater than ours
  } while (new_val != old && (old = atomicCAS(addr, old, new_val)) > new_val);
  return old;
}

This avoids any atomic operation if we know we won't change the value at addr.

arjunh

@Olorin, @rofer. I think we may have given the wrong impression here; your implementations will both do the right thing.

However, a better question might be this:

If compare-and-swap(addr, old, new) returns old, does that imply that no thread modified the value at address addr before we made the call to compare-and-swap?

For atomic-increment, this nuance won't cause a correctness issue. However, there may be other operations where this subtlety could have severe consequences.

ericwang

@arjunh, I understand your point that this implementation cannot rule out other modifications between the time of the read and the time of the swap.

But could you please enlighten me about the severe consequences you mentioned? Is it a performance issue? Or would there be correctness issues if we applied this method in some other functions?

arjunh

@ericwang: It's not really this particular method that could be faulty, but rather the general approach. For atomic increment, the approach (use compare-and-swap to check whether the value at the designated address has changed) works fine.

However, suppose you're trying to implement a data structure that uses compare-and-swap to check whether another thread has modified the data structure before the currently running thread has had a chance to execute its update (say, a stack). You might use a compare-and-swap to see if another thread has modified the top element of the stack when performing a pop. As we'll soon find out, our simple compare-and-swap won't work so well anymore.
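
Here's a rough sketch of the kind of pop I mean, written with C11 atomics rather than CUDA's atomicCAS (the names Node, Stack, and pop are just illustrative):

#include <stdatomic.h>
#include <stddef.h>

typedef struct Node {
  int value;
  struct Node* next;
} Node;

typedef struct {
  _Atomic(Node*) top;
} Stack;

// Naive lock-free pop: the CAS only checks that top still points at the
// same node, not that the rest of the stack is unchanged.
int pop(Stack* s, int* out) {
  Node* old_top;
  Node* new_top;
  do {
    old_top = atomic_load(&s->top);
    if (old_top == NULL)
      return 0;                  // empty stack, nothing to pop
    new_top = old_top->next;     // read the successor before the CAS
    // ABA hazard: between the load and the CAS, other threads can pop
    // old_top, push other nodes, and then push a node that happens to
    // sit at old_top's old address; the CAS still succeeds, but
    // new_top is stale and the stack is corrupted.
  } while (!atomic_compare_exchange_weak(&s->top, &old_top, new_top));
  *out = old_top->value;
  return 1;
}

For a plain counter, a value that changes and changes back is harmless; here the value is a pointer into a structure whose shape may have changed underneath us, which is exactly where the trouble starts.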

If you're interested, you may want to take a look at the ABA problem, just to get a taste of what we'll look at in the next lecture.

BryceToTheCore

I dashed home tonight excited to post on this slide, because I did not believe the claim that atomic increment wouldn't work. I then saw that others have already posted good counter-examples. I then looked up the ABA problem on Wikipedia: https://en.wikipedia.org/wiki/ABA_problem.

I read it, wondered what the problem was, and then realized that much of my lock-free data structure thinking had been done assuming the presence of a garbage collector. I think it is much easier to code lock-free data structures with an automatic garbage collector, because then if a thread accesses stale nodes, we can mark them as stale and it will not affect the guarantees of a carefully thought-out program.

I still believe in simple compare-and-swap based stacks and queues in the presence of automatic garbage collection. I have not yet thought through implementations involving manual memory management.

rbcarlso

The ABA problem is very interesting, and exposes an assumption that we may have been making about this kind of implementation for atomic increment. The implementation here checks to see if the value is the same as it was before, NOT if the value hasn't been changed. Strictly speaking, another process could change the thing a million times as long as it gets back to the original value before the CAS is performed and the swap will still take place. In this case the atomic increment is not actually atomic. In a truly atomic operation, the value will not change at all between the read and the write. In this implementation, the value of the integer will be the same as it would be if the operation were atomic, but the other assumptions that atomicity allows us to make cannot be made here.
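
A toy, single-threaded way to see this point (just simulating the interleaving in one thread with C11 atomics, not the actual GPU intrinsic):

#include <stdatomic.h>
#include <stdio.h>

int main(void) {
  atomic_int counter = 5;

  int old = atomic_load(&counter);          // our "read" sees 5

  // Pretend another thread runs here and changes the value, then
  // changes it back: 5 -> 100 -> 5
  atomic_store(&counter, 100);
  atomic_store(&counter, 5);

  // The CAS only compares values, not history, so it succeeds anyway.
  int expected = old;
  if (atomic_compare_exchange_strong(&counter, &expected, old + 1))
    printf("CAS succeeded; counter = %d\n", atomic_load(&counter));   // prints 6

  return 0;
}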

The potential side-effects in this course are really starting to add up. Is anyone else starting to miss functional programming?

BryceToTheCore

@rbcarlso

I am not missing functional programming at all. The use of persistent data structures with read-only data would alleviate our parallelism woes, but would rule out the applications where lock-free data structures are used. The use of purely functional data structures would leave every thread operating on structures whose initial pointers are only accessible in that thread's local scope.

S = A -> B -> C

Thread 1 calls: Pop(S) gets S1 = B -> C

Thread 2 calls: Push(S, D) gets S2 = D -> A -> B -> C

Thread 1 and 2 do not actually have pointers to the same initial structure now and cannot inform other threads about their own updates due to the lack of mutations.
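
A quick sketch of what I mean by persistent here (illustrative C, not any real library): push and pop never mutate the list they are given; they just hand back a new head, so Thread 1 and Thread 2 above each end up holding their own private version of S.

#include <stdlib.h>

typedef struct PNode {
  int value;
  const struct PNode* next;
} PNode;

// Push allocates a fresh node in front of s; s itself is untouched.
const PNode* push(const PNode* s, int v) {
  PNode* n = malloc(sizeof *n);   // error handling omitted in this sketch
  n->value = v;
  n->next = s;
  return n;
}

// Pop just returns the tail; again, s is never modified.
const PNode* pop(const PNode* s) {
  return s ? s->next : NULL;
}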

These persistent structures have wonderful uses, but they don't fit the applications where lock-free and atomic structures are called for. They are different use cases.

skywalker

In atomicCAS, shouldn't we be returning *addr? Otherwise it always returns the value read at the address, regardless of whether we swap it or not.
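
For reference, my understanding of atomicCAS's documented behavior, written as if it were ordinary sequential code (the real thing performs all three steps in one atomic transaction):

int atomicCAS_semantics(int* address, int compare, int val) {
  int old = *address;                       // value currently at address
  *address = (old == compare) ? val : old;  // store val only if it matched
  return old;                               // always return the pre-operation value
}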

lament

@BryceToTheCore Interesting post, if it is true (which I say not to imply you are wrong, but to indicate that I lack sufficient background knowledge to fully judge its content).

Corian

I think I'm misunderstanding something. In atomic_min, why do we keep retrying while the value returned from atomicCAS is not old? Doesn't atomicCAS return old if it fails, so we'd want to try again while it's still old?