Previous | Next --- Slide 12 of 30
Back to Lecture Thumbnails
regi

Just to double-check my understanding: this is a problem with high interconnect traffic. Even though P1 is holding the lock, it doesn't hold the cache line, which is continuously bouncing between processors and being invalidated when the test-and-set fails.

wcrichto

Correct. Because each test-and-set requires a read with intent to update (BusRdX), by our MSI protocol we have each test-and-set attempt bouncing the line out of another processor's cache.

dsaksena

In a multiprocessor system, atomic instructions such as test_and_set are bad because they block the bus and invalidate caches but even in a uniprocessor, they are bad as they cause the system to block the DMA.

DMA is a system where the MMU is an arbitrator who decides which device reads or writes to a memory location kind of like our bus arbitrator thus the issues with blocking of DMA (just like blocking of bus) pop here too.

gryffolyon

A bus lock used to be the mechanism used. Nowadays, we use LLSC. Load Linked Stored Conditional to implements these which does not block the whole cache line

sanchuah

I found a link in wiki describing LLSC, but I am still not sure what it is.

ananyak

Why can't the test and set operation first request for read access, and then request for exclusive access if and only if the variable being tested is in fact 0? That would avoid the need for a test and test and set lock.

I would guess that it's because sending 2 requests would increase the latency, but I'm not sure if this answer is right.

parallelfifths

I thought Kayvon's comment from lecture was particularly helpful in summarizing this slide-- even though Processor 1 has the lock, it doesn't (necessarily) have the cache line that the lock pertains to. The lock is not locking the ability of other processors to attempt updates; it's only locking the ability of other processors to complete those updates.

chenliu

@ananyak Imagine if you have P1 and P2, P1 first requests for read access, finds that the value is 0, and P2 requests for read too, getting a 0. Suppose P1 requests for BusRdX before P2 does so. Then the cache line in P2 will be invalidated. However P2 has already done checking the variable value and thinks that it can grab the lock, so it will proceed requesting a RdX.

After this both P1 and P2 are in the critical section.

Zarathustra

It is useful to remember, for this slide, that BusRdX means "acquire with INTENT to modify", but it does not mean that the processor issuing the request WILL modify once it gets the line, i.e., test-and-set fails since the requesting processor read a 1, so no modification occurs.

jcarchi

@sanchuah from what I can see LL/CS basically does the same thing as CAS but has some desirable properties CAS does not. One thing is that it is not susceptible to the ABA problem