This reminds me of how they handle concurrency at Five Guys Burgers and Fries.
The doors and entranceways to the store are implemented by a standard lock, because only one customer can fit through the door at a time.
The initial waiting line is implemented by a dynamic work queue.
Random people come into the store hoping to acquire the services of one of the cash register employees. They wait in a queue to be served and are then dynamically routed to the first available cashier.
Once a customer is being served, they place their order and are assigned a unique order index, which constitutes their ticket for the upcoming ticket-lock-based food distribution system. To give every customer a unique order number, the computer system needs to do an atomic read-and-increment on an ever-increasing counter.
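That atomic read-and-increment can be sketched with C11 atomics. This is just an illustration of the idea, not Five Guys' actual system; the names `order_counter` and `next_order_number` are mine:

```c
#include <stdatomic.h>

// Hypothetical shared order counter (zero-initialized at static storage).
atomic_uint order_counter;

// atomic_fetch_add returns the old value and increments it in one
// indivisible step, so every customer gets a unique, consecutive
// order number even if many arrive at the register simultaneously.
unsigned next_order_number(void) {
    return atomic_fetch_add(&order_counter, 1);
}
```

Because the read and the increment are a single atomic operation, two concurrent customers can never observe the same value.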
Customers enter a random-access waiting structure called the "tables" and "chairs". Each customer takes a lock so as not to disturb the other pleasant patrons, and waits for their ticket to be served.
Whenever an order is ready, the Five Guys employees implement cache coherence by informing all of the customer processors that their now_serving set now contains the given ticket value. The food distribution zone has a finite capacity, so it is important that the customer processors respond to the updated serving state with low latency and quickly retrieve their food.
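Spinning on your ticket number and the employee calling the next one is exactly a ticket lock. Here is a minimal self-contained sketch in C11 atomics (the struct and function names are my own, not from the lecture):

```c
#include <stdatomic.h>

typedef struct {
    atomic_uint next_ticket;  // the dispenser at the register
    atomic_uint now_serving;  // the number the employees are calling
} ticket_lock_t;

void ticket_lock(ticket_lock_t *l) {
    // Grab a unique ticket with one atomic read-and-increment.
    unsigned my_ticket = atomic_fetch_add(&l->next_ticket, 1);
    // Spin reading now_serving from the local cache; coherence delivers
    // the update when the holder bumps it -- like hearing your number called.
    while (atomic_load(&l->now_serving) != my_ticket)
        ;  // busy-wait
}

void ticket_unlock(ticket_lock_t *l) {
    // A single write: it invalidates the waiters' cached copies once,
    // and each waiter then re-reads -- O(P) traffic per release.
    atomic_fetch_add(&l->now_serving, 1);
}
```

Note that acquiring consists of one atomic increment followed by plain reads; waiters never write while spinning, which is what keeps the coherence traffic down.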
The customer returns to the "tables" and "chairs" and consumes their food. The customers sometimes notice false sharing, where one customer takes up a table fit for 6; due to the timidity of typical customers, they do not want to sit at the same table as strangers.
Customers busy-wait while contending for the limited number of stalls in the restroom facilities and dynamically assign themselves to the first stall that opens that fits their needs. Customers then must wait in yet another queue to acquire the sink to wash their hands.
Exiting the store is implemented in a very similar way to entering it, except that in general priority is given to customers coming into the store, because it might be a cold day outside in Pittsburgh.
Just to clarify: in this case there is only one invalidation per lock release, and this message is sent to P processors. Hence the traffic is O(P), whereas in test-and-test-and-set, O(P^2) is the worst case: when the lock is released, it sends P invalidation messages, and all P processors then execute test-and-set, each invalidating the caches of the other P-1 processors.
Can someone explain to me how this is different from test-and-test-and-set, and why the interconnect traffic is O(P) and not O(P^2)?
What I think is that in the case of a ticket lock, we are looping over a single variable (i.e. l->now_serving), and each release will invalidate it in all the processors. So effectively it is the same as test-and-test-and-set.
It's O(P) since, like @flyne said, once the lock is released, the worst case is when all the processors have the lock in their caches, and there will be an invalidation on all P processors.
The difference between the two algorithms is that in test-and-test-and-set all processors will try to grab the lock, so the traffic will be O(P), as opposed to the ticket lock, where only one processor will grab the lock, so the traffic will be O(1).
I agree with @flyne and @aznshodan. In test-and-test-and-set, upon lock release all the processors will try to grab the same lock, generating O(P^2) invalidations. But in this case, a lock release generates O(P) invalidations in the worst case, and since waiting to acquire the lock is only a read operation, no further invalidations occur.
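For comparison, here is a self-contained test-and-test-and-set sketch (my own names, C11 atomics). The key difference from a ticket lock is that every waiter may issue a write (the exchange) on each release, which is where the extra invalidation traffic comes from:

```c
#include <stdatomic.h>

void tts_lock(atomic_int *lock) {
    for (;;) {
        // "test": spin on a locally cached copy -- reads only, no traffic
        // while the lock stays held.
        while (atomic_load(lock) != 0)
            ;
        // "test-and-set": on a release, all P waiters can fall through to
        // here at once, and each atomic exchange invalidates the other
        // P-1 caches -- the O(P^2) worst-case burst per release.
        if (atomic_exchange(lock, 1) == 0)
            return;  // we observed 0 and set it to 1: lock acquired
    }
}

void tts_unlock(atomic_int *lock) {
    atomic_store(lock, 0);  // one write invalidates all cached copies
}
```

In the ticket lock, by contrast, only the releaser writes; the waiters just re-read now_serving, so each release costs O(P) rather than O(P^2).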
In OS, we talked about 3 criteria for a lock: mutual exclusion, progress, and bounded waiting.
The ticket lock ensures bounded waiting, since every thread is effectively queued: threads acquire the lock in strict ticket order, so at most P-1 other threads can be served before a given waiter.