If we are thinking about an architecture with a relaxed (weakly consistent) memory model, would we need memory fences to prevent reads and writes from being reordered?
rojo
Having many memory fences would probably be a bad idea, since each fence stalls program execution. This would push the code toward nearly serial execution. Perhaps there is a finer-grained form of control.
gryffolyon
Aren't fences already implemented inside atomic_cir_increment()? For example, from what I have learned, Intel provides atomic instructions like xadd, xchg, and cmpxchg which, when locked, inherently act as fences and so provide the ordering needed for consistency.
lament
Notice that we padded the elements of the status array so that updating one element does not generate cache-coherence traffic with other processors. In other words, if we did not pad the elements, multiple locks could fit in a single cache line, resulting in false sharing even though different threads of control on different processors are accessing different locks.
Kapteyn
I think we have to check that the circular increment doesn't cause any two processors to end up with the same index, which can be avoided by simply taking the index modulo the number of processors.
toothless
What about doing an ordinary atomic_increment(&l->head) and taking it mod P afterwards? If P is a power of two (say 16), the mod becomes a bitwise AND, which is really cheap.
VP7
Since the given piece of code is just C-like pseudocode, I would assume that "atomic increment" implicitly means the atomic read-modify-write will include whatever fences are needed to ensure consistency.