jmc

The ticket lock reduces interconnect traffic by a factor of P over test-and-test-and-set (and by an even larger factor over the basic test-and-set). When the lock is released, only the processor holding the next "ticket number" acquires it, without issuing any BusRdX on the acquire path, rather than all P processors racing to test-and-set in sequence and invalidating each other's caches.
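For contrast, here is a minimal test-and-test-and-set loop (a sketch, not the lecture's exact code) showing where that factor of P comes from: on each release, all P spinners fall out of the inner read loop and race to test-and-set, and each attempt is a BusRdX that invalidates the line in everyone else's cache.

```c
#include <stdatomic.h>

/* Test-and-test-and-set (sketch).  On every release, all P spinning
 * processors see lock == 0 and race to atomic_exchange; each attempt
 * is a BusRdX that invalidates the line everywhere else.  A ticket
 * lock's acquire path issues no such writes during a handoff. */
void tts_lock(atomic_int *lock) {
    for (;;) {
        while (atomic_load(lock) != 0)      /* "test": spin on cached copy */
            ;
        if (atomic_exchange(lock, 1) == 0)  /* "test-and-set": one BusRdX  */
            return;                         /* acquired                    */
    }
}

void tts_unlock(atomic_int *lock) {
    atomic_store(lock, 0);
}
```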

axiao

The ticket lock also avoids the interconnect traffic that basic test-and-set generates while a lock is held and other processors continuously try to acquire it. In the ticket lock, the only writes are a single fetch-and-increment of next_ticket when a processor arrives and the now_serving increment when a processor unlocks; while waiting, each processor only needs to read the now_serving variable.
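A minimal sketch of the lock under discussion, using C11 atomics (the struct and field names are assumptions; the lecture's version may differ slightly):

```c
#include <stdatomic.h>

typedef struct {
    atomic_uint next_ticket;   /* next ticket to hand out       */
    atomic_uint now_serving;   /* ticket currently allowed in   */
} ticket_lock_t;

void ticket_lock(ticket_lock_t *l) {
    /* One atomic write per arrival: take a ticket. */
    unsigned my_ticket = atomic_fetch_add(&l->next_ticket, 1);
    /* While waiting, only READ now_serving -- no writes, no BusRdX. */
    while (atomic_load(&l->now_serving) != my_ticket)
        ; /* spin */
}

void ticket_unlock(ticket_lock_t *l) {
    /* The one write on the release path: admit the next ticket holder. */
    atomic_fetch_add(&l->now_serving, 1);
}
```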

rootB

The ticket lock reduces interconnect traffic at the cost of an extra int of lock storage. Are there any other tradeoffs?

nemo

There will be false sharing in this case! If next_ticket and now_serving share a cache line, the next_ticket increment (a write) invalidates that line in the caches of processors that are just spinning on now_serving (cache line in the S state).
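Concretely (a sketch; 64-byte lines and the field layout from the sketch above are assumptions), the two 4-byte counters land on one line:

```c
#include <stdatomic.h>

typedef struct {
    atomic_uint next_ticket;   /* offset 0: written by each arrival      */
    atomic_uint now_serving;   /* offset 4: read by all spinning waiters */
} ticket_lock_t;               /* 8 bytes total -> one shared cache line */

/* Arrival:  atomic_fetch_add(&l->next_ticket, 1) -> BusRdX; the line is
 *           invalidated in every waiter's cache.
 * Waiters:  all re-read now_serving -> P read misses, even though its
 *           value did not change.  That is the false-sharing traffic.  */
```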

ask

Another advantage of the ticket-based lock is that it ensures fairness. No thread contending for the lock is starved, since tickets are handed out in arrival order, giving essentially FIFO behavior.

eourcs

It also extends easily to a priority scheme thanks to this centralized structure. If two threads of different priorities are waiting to grab a lock (sometimes pertinent in real-time systems), they can simply be placed in a priority queue. Such a solution would not be so simple with other forms of synchronization.
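A hypothetical sketch of that idea (entirely illustrative, not from the lecture): guard a small priority queue with the ticket lock from the sketch above, and on release hand the lock directly to the highest-priority waiter. Bounds checking and dynamic sizing are omitted.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define MAX_WAITERS 64

typedef struct {
    int          prio;
    atomic_bool *go;           /* the waiter spins on this flag */
} waiter_t;

typedef struct {
    ticket_lock_t guard;       /* short critical section around the queue */
    bool          held;
    waiter_t      q[MAX_WAITERS];
    int           n;
} prio_lock_t;

void prio_lock(prio_lock_t *l, int prio) {
    atomic_bool go = false;
    ticket_lock(&l->guard);
    if (!l->held) {                          /* lock free: take it */
        l->held = true;
        ticket_unlock(&l->guard);
        return;
    }
    l->q[l->n++] = (waiter_t){ prio, &go };  /* join the wait queue */
    ticket_unlock(&l->guard);
    while (!atomic_load(&go))                /* spin until handed the lock */
        ;
}

void prio_unlock(prio_lock_t *l) {
    ticket_lock(&l->guard);
    if (l->n == 0) {
        l->held = false;                     /* nobody waiting */
    } else {
        int best = 0;                        /* pick highest-priority waiter */
        for (int i = 1; i < l->n; i++)
            if (l->q[i].prio > l->q[best].prio) best = i;
        atomic_bool *go = l->q[best].go;
        l->q[best] = l->q[--l->n];           /* remove it from the queue */
        atomic_store(go, true);              /* hand over the lock; held stays true */
    }
    ticket_unlock(&l->guard);
}
```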

Master

The ticket-based spin lock still generates significant coherence traffic (and false sharing, if next_ticket and now_serving share a cache line).

When a thread tries to acquire the lock, it has to read the line from the core that currently holds it in the modified state.

When a thread releases the lock, the now_serving increment broadcasts invalidation messages on the interconnect to every core that holds the cache line.

shpeefps

Is false sharing actually an issue here? Even though every processor's cache is invalidated with each increment, they weren't going to do any useful work while waiting anyway. Or am I missing something?

vadtani

@shpeefps with padding, the next_ticket increment invalidates only the line holding next_ticket; without it, the invalidation also hits every processor sharing the line, i.e., everyone spinning on now_serving, which then has to re-read it over the interconnect.
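A minimal sketch of the padding fix, assuming 64-byte cache lines (alignas is C11; the names are illustrative, not the lecture's):

```c
#include <stdalign.h>
#include <stdatomic.h>

/* Pad each counter onto its own (assumed 64-byte) cache line so an
 * arrival's write to next_ticket no longer invalidates the line the
 * waiters are spinning on. */
typedef struct {
    alignas(64) atomic_uint next_ticket;  /* written by arriving threads */
    alignas(64) atomic_uint now_serving;  /* read by spinning waiters    */
} padded_ticket_lock_t;
```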

annag

The ticket lock model successfully accomplishes the goal of "low interconnect traffic": because acquisition order is assigned sequentially via tickets, there is no contention over which processor acquires the lock after it is released.

rrp123

When a processor releases the lock, wouldn't there still be O(P) invalidations, since every processor has to drop its copy of l->now_serving? Similarly, whenever a processor increments next_ticket, every other processor has to drop its copy of that variable as well, again causing O(P) invalidations. How is this better than the test-and-test-and-set lock?