Is being "'sufficiently separated' in time" dependent on a particular CPU's architecture?
I think "sufficiently separated in time" depends mostly on the latency of the coherence protocol. For instance, with invalidation-based cache coherence, P2 only knows that it needs to re-fetch address X after it receives P1's invalidation for address X. In this case, "sufficiently separated" means the propagation time of the invalidation.
It is interesting to note that "sufficiently separated" can mean different things for two different processors. If P2 and P3 read X in the same clock cycle, then depending on the write propagation times they might read different values (one with P1's update and the other without). For this to happen, the cache controllers would have to be on a more complicated topology than a simple bus, since on a shared bus every controller snoops the invalidation at the same time.
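To make the P2/P3 case concrete, here is a toy model rather than a description of any real protocol. It assumes P1 writes X at cycle 0 and that each reader keeps hitting on its stale copy until the invalidation reaches it. The arrival times and the names `invalidation_arrival`, `read_x` and `sufficiently_separated` are made up for illustration.

```python
# Toy model: P1 writes X at cycle 0. Every other processor keeps reading its
# stale cached copy until P1's invalidation reaches it.
OLD, NEW = 0, 1

# Hypothetical invalidation arrival times (in cycles) on a non-bus topology.
invalidation_arrival = {"P2": 3, "P3": 7}


def read_x(processor, time):
    """Value of X that `processor` observes when it reads at cycle `time`."""
    if time >= invalidation_arrival[processor]:
        return NEW  # copy was invalidated, so the read misses and fetches P1's value
    return OLD      # invalidation not received yet: hit on the stale copy


def sufficiently_separated(processor):
    """Smallest delay after the write at which `processor` is sure to see NEW."""
    return invalidation_arrival[processor]


# P2 and P3 read X in the same cycle but observe different values.
same_cycle = {p: read_x(p, 5) for p in invalidation_arrival}
print(same_cycle)  # {'P2': 1, 'P3': 0}
```

In this sketch a read at cycle 5 is "sufficiently separated" from the write for P2 but not for P3, which only sees the new value from cycle 7 onward.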
Note that this is for the same memory location X. Processors can see different orders for writes to two different memory locations. Coherence requires a single global serial order only for each individual location, not across all memory locations.
@Perpendicular You have to be careful because coherence applies across different memory locations that are on the same cache line. For example:
Thread 1 writes to address 100
Thread 2 writes to address 104
Both threads flush their caches to memory.
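Coherence is necessary; otherwise, one of the addresses will not be updated properly. Each flush writes back the whole line, so the later flush overwrites the other thread's write with stale data.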
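The lost update can be sketched with a small simulation. It assumes addresses 100 and 104 sit on the same cache line and that a flush writes back the whole line. The stored values 11 and 22 and the function names are hypothetical, and the "with coherence" path is a simplified stand-in for an invalidation-based protocol, not an exact model of one.

```python
def run_without_coherence():
    memory = {100: 0, 104: 0}   # one cache line holding both words
    cache1 = dict(memory)       # thread 1 loads the line
    cache2 = dict(memory)       # thread 2 loads the same line
    cache1[100] = 11            # thread 1 writes address 100
    cache2[104] = 22            # thread 2 writes address 104
    memory.update(cache1)       # thread 1 flushes the whole line
    memory.update(cache2)       # thread 2 flushes: its stale word 100 clobbers 11
    return memory


def run_with_coherence():
    memory = {100: 0, 104: 0}
    cache1 = dict(memory)       # thread 1 holds the line exclusively
    cache1[100] = 11            # thread 1 writes address 100
    # Thread 2's write first requests ownership of the line, which forces
    # thread 1 to write back its dirty copy and invalidate it.
    memory.update(cache1)
    cache1 = None
    cache2 = dict(memory)       # thread 2 fetches the up-to-date line
    cache2[104] = 22            # thread 2 writes address 104
    memory.update(cache2)       # final flush carries both updates
    return memory


print(run_without_coherence())  # {100: 0, 104: 22}
print(run_with_coherence())     # {100: 11, 104: 22}
```

Without coherence, the write to address 100 is lost even though the two threads never touched the same address.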