Previous | Next --- Slide 19 of 36
Back to Lecture Thumbnails
smklein

In class, the question was asked "If we have multiple processors running at the same time, how can we draw this timeline with each action happening in a separate, specified order?"

The answer is essentially that the interconnect uses total-store ordering, allowing us to model "cache differences" (misses + invalidations) in this ordering. Cache hits and other, non-cache operations are not bound to this ordering because the interconnect (and other processors) wouldn't care about these operations.

cardiff

Note: one may wonder why caches need to broadcast when they encounter a cache miss. In this example, it's not strictly necessary, but will become more important in the later models in this lecture.

yrkumar

Just to reiterate why this strategy works: a write operation by Processor X will result in Processor Y's cache having stale data if it contains the line with the address of the write. Thus, a solution to this problem is for Processor X to broadcast upon writing, which will cause Processor Y to invalidate that line. Thus, if Processor Y tries to access that address after Processor X writes to it, a cache miss will occur and fetch the updated data from memory.

kayvonf

@yrkumar: solid work in that description!

kayvonf

I'd like everyone in the class to make sure they understand the invalidation must occur "before" P0's write completes. The cache coherence protocol specifies that P0's write does not officially complete until P0's cache controller confirms that all other processors have invalidated their copy of the cache line.

Question: Is cache coherence preserved if P0's write completes AND THEN the invalidations occur?

smklein

@kayvonf:

No, cache coherence is not preserved if the write completes, and invalidations occur afterwards.

What if P1 tried to load X before P0's invalidation had propagated? The definition of coherence includes "the value returned by a read must the the value written by the last write on ANY PROCESSOR".

If P1 was capable of loading X from its cache, and not from P0's recent write, then coherence is lost.

LilWaynesFather

What happens if two processors write at almost the exact same time (the difference in time is smaller than the time it takes for their "shout" to be broadcasted to other processor)? Wouldn't both processors have a different value for the same location in memory? Would one processor just drop the line, load the updated line, and redo the write that it was gonna do?

kayvonf

@LilWaynesFather. You should assume that the invalidation must be confirmed prior to the write occurring. Therefore the situation you describe cannot occur.

ron

I'm guessing it's not possible for the following interleaving of operations to happen?

  1. P0 initiates a write to location X.
  2. P0 broadcasts the write to the other cache controllers.
  3. P0 confirms that the other processors have invalidated their caches.
  4. P3 attempts to read from location X - dirty miss.
  5. P3 brings the line containing X in from memory - the line hasn't been updated by P0 yet.
  6. P0 writes to location X in memory.