The write buffer (post retirement store/write buffer) is a common design in modern processors. The idea is that store (write) operation does not need to perform immediately, so we can put them into some buffers to reduce memory traffic while check the buffer for later load instructions to guarantee correctness.
But the write buffer could cause these memory instructions to be seen by others processors out-of-order, which relates to the memory consistency. For example, for TSO, a FIFO write buffer could be used for each processor for performance. While for PSO, a non-FIFO coalescing write buffer could be used as the store instructions are not ordered.
This comment was marked helpful 0 times.
uhkiv
So if we were to access X, do we need to check the buffer for any writes to X that are waiting? What are the performance implications of this?
This comment was marked helpful 0 times.
kayvonf
@uhkiv: Absolutely. A subsequent read of X needs to check both the cache and the write buffer to determine where the most recent copy of the value is. These checks would almost certainly be carried out in parallel.
The write buffer (post retirement store/write buffer) is a common design in modern processors. The idea is that store (write) operation does not need to perform immediately, so we can put them into some buffers to reduce memory traffic while check the buffer for later load instructions to guarantee correctness.
But the write buffer could cause these memory instructions to be seen by others processors out-of-order, which relates to the memory consistency. For example, for TSO, a FIFO write buffer could be used for each processor for performance. While for PSO, a non-FIFO coalescing write buffer could be used as the store instructions are not ordered.
This comment was marked helpful 0 times.
So if we were to access X, do we need to check the buffer for any writes to X that are waiting? What are the performance implications of this?
This comment was marked helpful 0 times.
@uhkiv: Absolutely. A subsequent read of X needs to check both the cache and the write buffer to determine where the most recent copy of the value is. These checks would almost certainly be carried out in parallel.
This comment was marked helpful 0 times.