Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2017

Slide 27 of 52

bazinga

Since ARM may reorder writes, it may need a fence to ensure the write to "ready" becomes visible only after the writes that precede it. x86 doesn't reorder writes with other writes, so it will not need the fence.
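To make the "ready" flag pattern concrete, here is a minimal C++ sketch using std::atomic. The names `payload`, `ready`, `producer`, `consumer`, and `run_handoff` are hypothetical stand-ins, not taken from the slide. The release store and acquire load state the required ordering portably. Compilers typically emit a plain store for the release on x86 and a store-release or barrier instruction on ARM, though the exact lowering depends on the compiler and target.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-ins for the slide's data and flag.
int payload = 0;
std::atomic<bool> ready{false};

void producer() {
    payload = 42;                                  // ordinary write to the data
    // Release: the write to payload cannot become visible after this store.
    ready.store(true, std::memory_order_release);
}

int consumer() {
    // Acquire: the read of payload below cannot be moved before this load.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;
}

// Runs one producer and one consumer; returns the value the consumer read.
int run_handoff() {
    payload = 0;
    ready.store(false);
    int seen = -1;
    std::thread c([&seen] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    assert(seen != -1);  // the consumer must have run to completion
    return seen;
}
```

Calling `run_handoff()` should return the value written by the producer. If both operations on `ready` were `memory_order_relaxed`, the language would no longer guarantee that on a weakly ordered machine.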

googlebleh

@bazinga That's not always true. Maybe in this scenario, and under the assumption that x86 doesn't reorder writes, x86 doesn't need a fence. However, x86 does provide fences: the MFENCE, LFENCE, and SFENCE instructions are memory barriers that can be inserted to enforce an ordering of memory accesses.

Sometimes, reordered reads can affect program correctness, so x86 programmers may need LFENCE. On the other hand, under certain circumstances x86 does mess with the order of writes with respect to other instructions. According to page 3076 of the Intel 64 and IA-32 SDM, x86 processors are capable of "write combining," which is the act of delaying writes to be combined with other writes. This is one of the scenarios in which an x86 programmer may need MFENCE or SFENCE.
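As a further illustration of x86 needing a fence (a case not raised in the comment above): even for ordinary memory, x86 allows a store to be reordered with a later load from a different address. The sketch below is the classic "store buffer" test written with portable C++ fences. The names `x`, `y`, `both_saw_zero_once`, and `count_violations` are hypothetical. On x86 the sequentially consistent fence is typically compiled to MFENCE or a LOCK-prefixed instruction, but that choice belongs to the compiler.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> x{0}, y{0};

// One trial: each thread stores to its own flag, fences, then loads the other.
// Returns true if both loads saw 0, the outcome the fences are meant to forbid.
bool both_saw_zero_once() {
    x.store(0);
    y.store(0);
    int r1 = -1, r2 = -1;
    std::thread t1([&r1] {
        x.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);  // full barrier
        r1 = y.load(std::memory_order_relaxed);
    });
    std::thread t2([&r2] {
        y.store(1, std::memory_order_relaxed);
        std::atomic_thread_fence(std::memory_order_seq_cst);  // full barrier
        r2 = x.load(std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    return r1 == 0 && r2 == 0;
}

// Repeats the trial and counts how often the forbidden outcome appeared.
int count_violations(int trials) {
    assert(trials > 0);
    int violations = 0;
    for (int i = 0; i < trials; ++i) {
        if (both_saw_zero_once()) ++violations;
    }
    return violations;
}
```

With both fences in place, `count_violations` should return 0 for any number of trials. Without them, both loads reading 0 is a permitted outcome, even on x86.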

pht

How does the processor efficiently implement atomic instructions without mutexes? Is this something under the hood that as programmers we don't have to worry about?

eourcs

@pht I'm not really an OS person, but in general, implementations of atomic instructions and mutexes depend on architecture, so it can be difficult to compare the two. However, atomic operations, assuming they don't translate directly to a single hardware instruction, usually act as hints to the compiler to insert things like memory fences in order to force a certain ordering.

In the context of parallel computing, I would gather that unless you are writing highly architecture-specific code (i.e. super non-portable), it shouldn't really matter how these things are implemented as long as you understand their implications in the abstract sense.
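For a concrete picture of an atomic update without a mutex, here is a minimal sketch of a lock-free increment built from compare-and-swap. The names `counter`, `cas_increment`, and `run_counter` are hypothetical. How the hardware carries out the compare-and-swap is architecture-specific (for example, a LOCK-prefixed instruction on x86 or a load-linked/store-conditional pair on ARM), and in practice `counter.fetch_add(1)` would be the simpler way to write this.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

std::atomic<int> counter{0};

// Increment without a mutex: retry until no other thread changed the value
// between our read and our attempted write.
void cas_increment() {
    int old_value = counter.load(std::memory_order_relaxed);
    while (!counter.compare_exchange_weak(old_value, old_value + 1)) {
        // On failure, old_value now holds the current value; just retry.
    }
}

// Spawns num_threads threads that each increment per_thread times,
// then returns the final counter value.
int run_counter(int num_threads, int per_thread) {
    assert(num_threads > 0 && per_thread >= 0);
    counter.store(0);
    std::vector<std::thread> threads;
    for (int i = 0; i < num_threads; ++i) {
        threads.emplace_back([per_thread] {
            for (int j = 0; j < per_thread; ++j) cas_increment();
        });
    }
    for (auto& t : threads) t.join();
    return counter.load();
}
```

For instance, `run_counter(4, 10000)` should return 40000, since no increment is lost even though no thread ever blocks on a lock.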