Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2016

PIC

While this type of lock guarantees a strong sense of fairness or "bounded waiting", it comes at a cost. This sort of lock can cause higher latency because we have to wait for the thread whose turn it is to be scheduled. If all that matters is through put and this lock is protecting something like a work queue a spin lock might be more appropriate.

sasthana

Is this the bakery algorithm?

PIC

@sasthana close. The only difference is that the bakery algorithm is generalized for situations where getting a unique ticket is not trivial, such as distributed systems.

eknight7

Just to get an idea of cache coherence traffic for this lock:

Suppose Proc1 has the lock and l->now_serving == 1. Suppose Proc2, Proc3, .. need the lock. Each of them will get a ticket by incrementing l->next_ticket. For this, Proc2 invalidates cache line of l->next_ticket from all other procs by sending BusRdX. (So struct must be written with proper padding such that l->next_ticket and l->now_serving are on separate cache lines). Proc3 invalidates cache line of l->next_ticket from Proc2, increments and so on. So while waiting for the lock, there are O(P) invalidations of the l->next_ticket cache line.

After getting the ticket number, Proc2 keeps reading l->now_serving. When l->now_serving == 2 (Proc1 unlocked), Proc2 will do its work in the critical section, then invalidate Proc1's cache line for l->now_serving when Proc2 unlocks. Similarly, Proc3 will only invalidate Proc2's cache line for l->now_serving when Proc3 unlocks and so on. So for every Proc, there is 1 invalidation of l->now_serving => there are O(P) invalidations of l->now_serving when all P procs want the lock.

In fact, we can say there are 2 cache line invalidations per proc that wants the lock. One for l->now_serving and one for l->next_ticket. This is O(P) invalidations total for P procs that want the lock.

colorblue

As we learn later in lecture for our problem space the ticket lock is not the ideal implementation. The ticket lock is good for ensuring a fair order in obtaining the lock and for low overhead. However when we are using a ticket lock we spin wait on until our number is the next to be called. Doing this means we are using cycles we don't have to be using and causing invalidations and acquisitions in the caches of the processors.

Richard

I still don't get what problems ticket lock approach may face... Yes, it wastes time when waiting for its ticket to be called, but previous implementations (in previous slides) also require waiting, since they all seem to be "busy waiting" synchronization.

ZhuansunXt

I have the same question as @Richard. The main problem the ticket lock solves is providing provision for fairness. But can we claim the latency is increased due to this "waiting in order"? I think there's no starvation here and every processor is guaranteed to acquire the lock after expected time span, which is not necessarily longer than spinning.

BensonQiu

@Richard, @ZhuansunXt One thing I can think of: If the lock is unlocked, but the next thread in line (my_ticket == l->now_serving) to acquire the lock is descheduled, then latency will increase because no other threads in line can acquire the lock.

xingdaz

How is this different from spinlock in terms of performance? Isn't it still occupying CPU compute resources? Every waiting threads is still busy waiting on the number being called. Only this time, the thread can't grab someone else's number.

Renegade

I think the main contribution of this ticket lock is removal of t-s attempts whenever the lock, which is contended across all threads, gets released. Here, only the thread that owns the lock can invalidate "now_serving", all others are just share-reading it.