Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2015

A Basic Snooping-Based Multi-Processor Implementation


Kapteyn

I'm not quite clear on how each part in the system (caches, memory manager, bus manager) should react in a situation where P1 issues a BusRd for a line that P2 has in the dirty state.

Say the caches obey the MESI protocol. Typically when cache 1 holds a modified line and cache 2 wants to read that line, cache 1 forces the read to back off while it flushes the line to main memory. So I'm thinking the appropriate steps might be something like:

P1 is granted the bus and issues a BusRd for A. The BusRd is added to the table of outstanding requests.
P2 snoops that P1 has asked for A, and P2 has A in the dirty state.
P2 knows it must flush line A before P1 can read it. P2 NACKs or does not respond to P1's BusRd until it can
1. get access to both the request address bus and the data bus
2. send line A over the data bus
3. **receive acknowledgement from memory that memory has received the updated data.**
Once memory tells P2 that it has the updated line, P2 stops NACKing and sends the snoop response to P1 that it can go ahead and load A in the shared state.
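
The steps above can be sketched as a toy model. This is only an illustration of the proposed sequence, not how real hardware is built: the `Cache` and `Memory` classes, the `"NACK"`/`"OK"` strings, and the immediate retry loop are all assumptions made for the example, and the flush is modeled as completing instantly.

```python
from enum import Enum


class State(Enum):
    M = "Modified"
    E = "Exclusive"
    S = "Shared"
    I = "Invalid"


class Memory:
    def __init__(self):
        self.data = {}

    def write_back(self, addr, value):
        self.data[addr] = value
        return True  # acknowledgement that memory now holds the updated line


class Cache:
    def __init__(self, name):
        self.name = name
        self.state = {}  # addr -> State
        self.data = {}   # addr -> value

    def snoop_bus_rd(self, addr, memory):
        """Respond to another cache's BusRd with 'NACK' or 'OK'."""
        current = self.state.get(addr, State.I)
        if current is State.M:
            # Flush the dirty line and wait for memory's acknowledgement.
            if memory.write_back(addr, self.data[addr]):
                self.state[addr] = State.S
            return "NACK"  # the requester has to retry after the flush
        if current is State.E:
            self.state[addr] = State.S
        return "OK"

    def bus_rd(self, addr, others, memory):
        """Issue BusRd, retrying while any snooper NACKs. Returns attempt count."""
        attempts = 0
        while True:
            attempts += 1
            responses = [c.snoop_bus_rd(addr, memory) for c in others]
            if "NACK" not in responses:
                break
        shared = any(c.state.get(addr, State.I) is not State.I for c in others)
        self.data[addr] = memory.data.get(addr, 0)
        self.state[addr] = State.S if shared else State.E
        return attempts


memory = Memory()
memory.data["A"] = 1            # stale copy in memory
p1, p2 = Cache("P1"), Cache("P2")
p2.state["A"], p2.data["A"] = State.M, 42
attempts = p1.bus_rd("A", [p2], memory)
```

In this run the first attempt is NACKed while P2 flushes, and the second attempt succeeds with both caches ending up in the shared state.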

If P2 chooses to not respond to P1's BusRd until the flush is complete, doesn't that imply that until the flush completes, no other request can get access to the request address bus because P1's BusRd is left hanging on the bus? This seems like a bad implementation because it's not making use of the split transaction bus.

On the other hand, assume the appropriate response to a NACK is the following: 1. all caches remove the NACKed request from their request tables, and 2. the issuer of the NACKed request waits for some set delay and then sends the BusRd again.

Then I think it would be better if P2 uses NACKs instead of just not responding, because this allows other requests to be issued on the request address bus while P1 is waiting for the flush to complete.
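
To make the argument concrete, here is a small cycle-by-cycle sketch of the NACK-and-retry behavior assumed above. The function `simulate`, its parameters (`nack_until`, `retry_delay`), and the one-request-per-cycle bus are hypothetical simplifications, not a description of a real split-transaction bus.

```python
def simulate(requests, nack_until, retry_delay=2):
    """Toy model of NACK-and-retry on the request bus.

    requests:    list of (issuer, addr) in arrival order
    nack_until:  dict addr -> first cycle at which the flush of addr is done
    retry_delay: cycles a NACKed issuer waits before re-issuing (assumed fixed)
    Returns a log of (cycle, issuer, addr, outcome).
    """
    pending = [(0, issuer, addr) for issuer, addr in requests]
    log = []
    cycle = 0
    while pending:
        # Grant the bus to the oldest request that is ready this cycle.
        idx = next((i for i, r in enumerate(pending) if r[0] <= cycle), None)
        if idx is not None:
            _, issuer, addr = pending.pop(idx)
            if cycle < nack_until.get(addr, 0):
                # NACK: the request leaves the table and is retried later.
                log.append((cycle, issuer, addr, "NACK"))
                pending.append((cycle + retry_delay, issuer, addr))
            else:
                log.append((cycle, issuer, addr, "OK"))
        cycle += 1
    return log


log = simulate([("P1", "A"), ("P3", "B")], nack_until={"A": 3})
for entry in log:
    print(entry)
```

In this run P3's BusRd for B is granted in the cycle right after P1's first NACK, which is the benefit being argued for: the request bus is not held while the flush of A is in progress.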

Is this how NACKs work?

Zarathustra

I think you may have gotten P1 and P2 mixed up in step 3... :P

Kapteyn

@Zarathustra yup thanks! fixed now.

Zarathustra

(and from then on xD)

Kapteyn

Reviewing my comment above again has made me realize that I was confusing abstraction with implementation of the snooping protocol. I believe the implementation is pretty simple:

Once P2 snoops the BusRd from P1, it sends out a signal on the dirty wire and proceeds to flush the line. Meanwhile, memory knows it should expect a flush from a processor for that line, since it also snooped that the dirty wire was asserted. P1 just waits, like always, for the response from memory to its BusRd. P1 knows that whenever it gets the response from memory, it should load the line into the shared state, because it snooped that P2 had it in the dirty state.

It is now the memory controller's job to know that it needs to first receive the dirty line from a processor (i.e. P2, but the memory controller doesn't actually need to know which processor the data is coming from) and then send that line on the data bus to P1.

This means that the memory controller needs to maintain some additional state besides just the request table. In this case, it needs to keep track of which requests cannot be serviced until it receives a flush for that address from a processor.
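
A minimal sketch of that extra state, assuming a hypothetical `MemoryController` class with `on_bus_rd` and `on_flush` handlers (these names, and the `dirty_wire` flag passed in as an argument, are made up for illustration):

```python
class MemoryController:
    def __init__(self, contents):
        self.contents = dict(contents)  # addr -> value held in memory
        self.awaiting_flush = {}        # addr -> requesters blocked on a flush
        self.delivered = []             # (requester, addr, value) sent on the data bus

    def on_bus_rd(self, requester, addr, dirty_wire):
        if dirty_wire:
            # Some cache holds the line dirty: defer until its flush arrives.
            self.awaiting_flush.setdefault(addr, []).append(requester)
        else:
            self.delivered.append((requester, addr, self.contents[addr]))

    def on_flush(self, addr, value):
        # The controller does not need to know which processor flushed.
        self.contents[addr] = value
        for requester in self.awaiting_flush.pop(addr, []):
            self.delivered.append((requester, addr, value))


mc = MemoryController({"A": 1, "B": 7})
mc.on_bus_rd("P1", "A", dirty_wire=True)   # deferred: P2 holds A dirty
mc.on_bus_rd("P3", "B", dirty_wire=False)  # serviced right away
mc.on_flush("A", 42)                       # P2's flush arrives, P1 is serviced
```

Here the read of B is serviced before the earlier read of A, since only requests for an address with an outstanding flush are held back.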