Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2016

Snooping-Based Cache Coherence

Slide 15 of 56

bojianh

A processor may reorder instructions in order to run faster, but the result must still appear correct to an outside observer.
Note that if the processor reorders instructions, there can be several correct outcomes for point 2 on the slide if the read and the write are not sufficiently separated in time.

CaptainBlueBear

Is being "sufficiently separated in time" dependent on a particular CPU's architecture?

randomthread

I think "sufficiently separated in time" depends mostly on the latency of the coherence protocol. For instance, with invalidation-based cache coherence, P2 will only know that it needs to re-fetch address X after it receives an invalidation from P1 for address X. So in this case, "sufficiently separated" means separated by at least the propagation time of the invalidation.

It is interesting to note that "sufficiently separated" can have different meanings for two different processors. If P2 and P3 read X on the same clock cycle, then depending on the write propagation times they might read different values (one with P1's update and the other without). For this to happen, the cache controllers would have to be connected by a more complicated topology than a simple bus.
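
To make the timing argument concrete, here is a minimal sketch of a toy model, not of any real protocol. It assumes P1 writes X at a fixed time and that each reader keeps returning its stale cached copy until the invalidation reaches it. The names `read_value`, `snapshot`, `WRITE_TIME`, and the per-processor delays in `DELAYS` are all made up for illustration.

```python
WRITE_TIME = 10                # hypothetical time at which P1 writes X
DELAYS = {"P2": 2, "P3": 5}    # hypothetical invalidation propagation delays


def read_value(read_time, write_time, invalidation_delay, old=0, new=1):
    """Return the value of X a reader sees in this toy invalidation model."""
    # The reader's cached copy stays valid until the invalidation arrives.
    invalidation_arrives = write_time + invalidation_delay
    return new if read_time >= invalidation_arrives else old


def snapshot(read_time):
    """Value of X seen by each reader if they all read at read_time."""
    return {
        proc: read_value(read_time, WRITE_TIME, delay)
        for proc, delay in DELAYS.items()
    }
```

Under these assumed delays, `snapshot(13)` gives P2 the new value and P3 the old one, while any read at time 15 or later sees the new value on both. With a single shared bus the two delays would be equal, so the disagreement could not occur in this model.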

Perpendicular

Note that this is for the same memory location X. Processors can see different orders for writes to two different memory locations. There does not have to be a single global serial order across all memory locations, only one for each individual location.
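
A small sketch of that distinction, under the assumption that each processor's view is recorded as a list of `(location, value)` writes in the order it observed them. The helper names `per_location_order` and `is_coherent` are hypothetical, and the check only covers the per-location ordering part of coherence.

```python
def per_location_order(observed, location):
    """Values written to one location, in the order this observer saw them."""
    return [value for loc, value in observed if loc == location]


def is_coherent(observations, locations):
    """True if all observers agree on the write order of each single location."""
    return all(
        len({tuple(per_location_order(obs, loc)) for obs in observations}) == 1
        for loc in locations
    )


# Two observers that interleave X and Y differently but agree per location.
p2_view = [("X", 1), ("Y", 1), ("X", 2)]
p3_view = [("Y", 1), ("X", 1), ("X", 2)]

# An observer that sees the two writes to X in the opposite order.
p4_view = [("Y", 1), ("X", 2), ("X", 1)]
```

Here `is_coherent([p2_view, p3_view], ["X", "Y"])` holds even though the two views interleave X and Y differently, while adding `p4_view` breaks it because the order of writes to X no longer matches.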

FarmerScrub

@Perpendicular You have to be careful because coherence applies across different memory locations that are on the same cache line. For example:

Thread 1 writes to address 100

Thread 2 writes to address 104

Both threads flush their caches to memory.

Coherence is necessary; otherwise one of the addresses will not be updated properly.
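
The scenario above can be sketched with a toy write-back model. This assumes a 64-byte cache line (so addresses 100 and 104 share the line covering 64 to 127), 4-byte words, and caches that flush whole lines. The function names and the values 1 and 2 being written are made up for illustration, and the "with coherence" version only mimics the effect of an invalidation forcing a write-back before the second write.

```python
LINE_SIZE = 64   # assumed cache-line size in bytes
WORD_SIZE = 4    # assumed word size in bytes


def line_addresses(addr):
    """Word addresses in the cache line that contains addr."""
    base = addr - addr % LINE_SIZE
    return range(base, base + LINE_SIZE, WORD_SIZE)


def fresh_memory():
    """Memory for the single line of interest, initialized to zero."""
    return {a: 0 for a in line_addresses(100)}


def run_without_coherence(memory):
    # Both caches fetch the whole line before either write is visible.
    cache1 = {a: memory[a] for a in line_addresses(100)}
    cache2 = {a: memory[a] for a in line_addresses(104)}
    cache1[100] = 1          # Thread 1 writes to address 100
    cache2[104] = 2          # Thread 2 writes to address 104
    memory.update(cache1)    # Thread 1 flushes the whole line
    memory.update(cache2)    # Thread 2's flush carries a stale copy of 100
    return memory


def run_with_coherence(memory):
    cache1 = {a: memory[a] for a in line_addresses(100)}
    cache1[100] = 1          # Thread 1 writes to address 100
    # Thread 2's write invalidates Thread 1's copy, which is written back first.
    memory.update(cache1)
    cache2 = {a: memory[a] for a in line_addresses(104)}
    cache2[104] = 2          # Thread 2 writes to the up-to-date line
    memory.update(cache2)
    return memory
```

In this model, `run_without_coherence(fresh_memory())` ends with address 100 still holding 0 because the second flush overwrites it, whereas `run_with_coherence(fresh_memory())` keeps both writes.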