Slide 30 of 57
kayvonf

I want to see someone explain why this is a livelock situation.

iZac

When P1 has the cache line in M state, P2 sees that the line is modified elsewhere and issues a BusRdX. But P1 has not actually modified the line yet; before it can, the line is invalidated by P2's BusRdX. Similarly, once P2 gets the cache line in M state, P1 issues its own BusRdX, so P2 never gets to modify the line either. This keeps going forever unless we let the processor that obtained exclusive ownership complete its write.

The only result in such a system is the cache line flying from one processor to another. Nobody is able to write to it, that is, nobody is able to do any productive work, which is a livelock.

Kapteyn

What do we mean by "allowing the write to complete"? In class we've talked about how a write is "committed" as soon as all other caches acknowledge the BusRdX and invalidate their lines.

So in this livelock case, the write has committed by our definition, but no real progress has been made because the cache that owns the line hasn't had a chance to update the value of that line yet. It seems then that "allowing the write to complete" means waiting for the cache that owns the line to update the line, so that when a contending cache issues a BusRdX, the modified line will get flushed to memory.

Assuming this is what we mean by allowing a write to complete, I'm thinking a reasonable implementation for avoiding livelock would be to add an extra bit of cache line state marking whether or not an invalid line has been updated in the cache that currently owns it.

To be more specific, we would have something along the lines of:

  • cache 1 gets line A in read exclusive state

  • cache 2 invalidates line A in its cache, line is marked as invalid but not yet updated in the cache that owns it

  • cache 2 wants to send a BusRdX for line A but checks the status of the line, sees that it's invalid but not yet updated, and holds off on issuing the BusRdX (puts it in a write buffer)

  • cache 1 updates line A in itself and broadcasts a message to all other caches to let them know that it has updated the value of line A

  • cache 2 changes the state of line A in its cache to invalid and updated

  • cache 2 can now issue its BusRdX for line A and does so

  • cache 1 snoops the BusRdX and flushes the updated line A to main memory

  • cache 2 gets the updated line A and now has it in read exclusive state

The cost of this implementation is the extra space (1 bit per line) needed to store this additional state and, more significantly, the extra coherence traffic of a cache telling all other caches that it has updated a line every time it does so.

Kapteyn

Can a livelock situation also happen if we have P1 issuing a BusRdX and P2 issuing a BusRd?

I'm not sure, but I think the answer is yes. When a processor holds a line with exclusive ownership, if it snoops a BusRd for that line from another processor, it must flush that line to main memory. P1 could be unlucky enough that it never gets a chance to modify the line before being forced to flush because of P2.

Unlike the example in the slide, it is not immediately obvious why this would result in livelock, because we don't have two processors trying to write. Couldn't P1 just suffer the unnecessary flush once, wait to get access to the bus, send out another BusRdX, and then update the line?

But we could very well have a livelock situation even when one processor is writing and the other reading the same cache line.

Take, for example, this situation where P2 spins on a flag:

processor 1 runs the following code:

void set_flag() {
    flag = 1;
}

do_p1_stuff();

set_flag();

processor 2 runs the following code:

while (flag == 0) {} // spin until P1 sets the flag

do_p2_stuff();

(assume flag is initialized to 0)

Say P1 is executing set_flag() and issues a BusRdX for flag. But before it gets to set flag = 1, it snoops a BusRd from P2 for flag, so it flushes the cache line containing flag, which still has flag equal to 0. P2 reads flag = 0 and stays in its while loop. Then P1 issues the BusRdX again, since it didn't actually get to update flag. But again, before it can set flag = 1, P2 issues another BusRd for flag since it's spinning in its while loop, and the cycle continues.

kayvonf

@Kapteyn: Your comment "It seems then that allowing the write to complete means waiting for the cache that owns the line to update the line so that when a contending cache issues a BusRdX, the modified line will get flushed to memory" is correct.

However, a simpler way to avoid livelock in the coherence protocol in your example is the following: Cache 1 will not report a snoop result (or equivalently, will NACK cache 2's BusRdX for line X) until its own committed (but not completed) write to X completes.

sanchuah

The livelock situation is similar to the livelock that happens with pessimistic conflict detection in transactional memory: both threads want to write to the same place, and they keep restarting each other.