Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2017

Slide 15 of 55

Master

It's somewhat related to eventual consistency in distributed systems. The most important principle is that older write requests will not overwrite newer ones.

It seems that in the 418 setting, we don't have anything like a Lamport clock or a vector clock to establish a global ordering of loads and stores (as mentioned in the slide). So as long as all processors observe the writes in the same order, can it be considered "write serialization"? Even in the example in the slide, if all processors observe "b" first and then "a", is that OK?

rootB

@Master, I think so. The "global ordering" is simply whatever order of writes the processors observe, as long as all processors agree on it.
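To make that concrete, here is a minimal sketch in Python of what "all processors agree" means. It is only an illustration: the function name and the trace format are hypothetical, and it assumes every processor observes every write to X.

```python
def same_write_order(observations):
    """Return True if every processor observed the writes to X in the same order.

    `observations` maps a (hypothetical) processor name to the list of values
    it saw written to address X, in the order it saw them.
    """
    orders = list(observations.values())
    return all(order == orders[0] for order in orders)


# Every processor sees "b" and then "a": the writes are serialized,
# regardless of which write was issued first in wall-clock time.
agree = {"P1": ["b", "a"], "P2": ["b", "a"], "P3": ["b", "a"], "P4": ["b", "a"]}

# P3 and P4 disagree about the order: this violates write serialization.
disagree = {"P3": ["a", "b"], "P4": ["b", "a"]}

print(same_write_order(agree))
print(same_write_order(disagree))
```

The check never asks which write "really" happened first; it only asks whether the observers agree, which is the point being made about the global ordering.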

POTUS

I'm kind of confused by the content on this slide. It says at the top that "two writes to address X by any two processors are observed in the same order by all processors". So why is it that P3 and P4 can observe the writes in different orders?

hzxa21

@Master I think the consistency here is different from consistency in distributed systems or transactional database systems. Because we are talking about how to make cache behavior correct at the low architectural level, we don't actually care about higher-level consistency; that is left to the programmer. For example, if you write a program that uses two pthreads to write "a" and "b" respectively to address X, all you expect from the architecture is that no cache ends up "inconsistent" (holding a stale value), because the abstraction behind pthreads is that they can run in parallel. So if you want to enforce the order of the writes, or deal with the consistency issues you mentioned, you need to add blocking or synchronization explicitly in your program, or use an appropriate higher-level protocol.
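As a rough sketch of what "synchronize explicitly" could look like, here is the two-writer example using Python's `threading` module rather than pthreads (a substitution made only for brevity). The names `X`, `a_written`, `write_a`, `write_b`, and `run` are hypothetical, and the choice that "a" must come before "b" is an assumption for illustration.

```python
import threading

X = None                       # stands in for the shared address X
a_written = threading.Event()  # hypothetical flag: set once "a" has been written


def write_a():
    global X
    X = "a"
    a_written.set()            # signal that the first write is done


def write_b():
    global X
    a_written.wait()           # block until "a" has been written
    X = "b"


def run():
    """Start both writers (the "b" writer first) and return the final value of X."""
    global X
    X = None
    a_written.clear()
    threads = [threading.Thread(target=write_b), threading.Thread(target=write_a)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return X


print(run())
```

Without the event, either write could land last; with it, the program itself decides the order, and the hardware only has to keep the caches coherent. Python's interpreter hides most of the hardware-level effects discussed on the slide, so this only illustrates ordering at the program level.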

rrp123

In distributed systems, these consistency issues are very common as well, but there they are solved by majority-vote algorithms such as Paxos. In this case, however, we will not be able to use such algorithms, because the overhead of having processors communicate P2P would be very high, and we would rather use the interconnect.

life

@POTUS: I think the example just shows what would happen without the third coherence definition.