Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2016

Slide 22 of 51

jsunseri

I'm interpreting the last point to mean that these optimizations won't break a single-threaded program, and thus are "valid" optimizations in that case. In contrast, as the examples in this lecture show, these optimizations can break or at least change the behavior of a multi-threaded program.

maxdecmeridius

These reasons also apply to superscalar instruction fetching, right?

grarawr

Would relaxing these orderings affect the correctness of the program?

yimmyz

Overall, the motivation for memory access reordering is to hide the latency of costly memory accesses (e.g., writes relative to reads). The synchronization primitives covered in the lecture (e.g., memory fences) make sure that the program is still well-behaved with these optimizations (e.g., the question in Exam One).
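
To make the fence point concrete, here is a minimal sketch in C++11, assuming a simple producer/consumer handoff. The names `payload`, `ready`, and `run_message_passing` are made up for illustration and are not from the lecture. The release fence keeps the data write ahead of the flag write (W -> W), and the acquire fence keeps the data read behind the flag read (R -> R), so the consumer should always see the published value.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: hands one value from a producer thread to a
// consumer thread and returns what the consumer observed.
int run_message_passing() {
    int payload = 0;                 // ordinary (non-atomic) data
    std::atomic<bool> ready{false};  // flag that publishes the data
    int observed = -1;               // written only by the consumer

    std::thread producer([&] {
        payload = 42;                                         // W1: write the data
        std::atomic_thread_fence(std::memory_order_release);  // W1 may not move below W2
        ready.store(true, std::memory_order_relaxed);         // W2: publish the flag
    });

    std::thread consumer([&] {
        while (!ready.load(std::memory_order_relaxed)) { }    // R1: spin until published
        std::atomic_thread_fence(std::memory_order_acquire);  // R2 may not move above R1
        observed = payload;                                   // R2: read the data
    });

    producer.join();
    consumer.join();
    assert(observed != -1);  // the consumer ran and stored a value
    return observed;
}
```

Without the two fences, the relaxed flag operations alone would not order the accesses to `payload`, and the program would have a data race. On some toolchains this needs to be built with thread support enabled (for example `-pthread` with GCC or Clang).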

grizt

@grarawr I believe that these are all assuming that reordering will not affect the correctness of a program with a single instruction stream.

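For example, W -> W reordering can occur in

a = 5; b = 7;

because the two writes go to different locations, but cannot occur in

a = 5; a = 7;

because swapping the two writes to a would change its final value.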
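
A rough way to check the two cases above in a single thread is to apply the writes in both orders and compare the final state. This is only a sketch: the function names are hypothetical, and the "reordered" versions swap the statements by hand to stand in for what the hardware or compiler might do.

```cpp
#include <cassert>
#include <utility>

// Writes to different locations, in program order.
std::pair<int, int> independent_program_order() {
    int a = 0, b = 0;
    a = 5;
    b = 7;
    return {a, b};
}

// The same two writes with their order swapped by hand.
std::pair<int, int> independent_reordered() {
    int a = 0, b = 0;
    b = 7;
    a = 5;
    return {a, b};
}

// Two writes to the same location, in program order.
int same_location_program_order() {
    int a = 0;
    a = 5;
    a = 7;
    return a;
}

// The same two writes swapped by hand: the last write wins,
// so the final value differs from program order.
int same_location_reordered() {
    int a = 0;
    a = 7;
    a = 5;
    return a;
}
```

The two `independent_*` functions return the same pair, so a single instruction stream cannot tell the orders apart. The two `same_location_*` functions return 7 and 5 respectively, which is why that W -> W reordering is not allowed.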