Slide 49 of 57
cmusam

When foo and bar run as separate passes, tmp is written to memory by foo and then read back by bar, which uses memory bandwidth unnecessarily. If the two passes are combined, the compiler may be able to keep each intermediate value in a register instead of a memory buffer.
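A minimal sketch of the difference, assuming foo and bar are simple element-wise functions (their bodies here are made up for illustration, and the names two_pass and fused are hypothetical, not from the slide):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical element-wise stages; the slide does not define foo or bar. */
static float foo(float x) { return 2.0f * x; }
static float bar(float x) { return x + 1.0f; }

/* Two passes: every tmp[i] is stored in the first loop
   and loaded again in the second loop. */
void two_pass(const float *input, float *tmp, float *output, size_t n) {
    assert(input && tmp && output);
    for (size_t i = 0; i < n; i++)
        tmp[i] = foo(input[i]);
    for (size_t i = 0; i < n; i++)
        output[i] = bar(tmp[i]);
}

/* One pass: the intermediate is a local variable, so the compiler
   is free to keep it in a register and no tmp array is needed. */
void fused(const float *input, float *output, size_t n) {
    assert(input && output);
    for (size_t i = 0; i < n; i++) {
        float t = foo(input[i]);
        output[i] = bar(t);
    }
}
```

Both functions produce the same output. In the two-pass version each element costs two loads and two stores; in the fused version it costs one load and one store.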

stride16

The code on the slide reads each input element from memory only once. It can be written as:

output[i] = bar(foo(input[i]));

The same computation can be written in several different ways, which can lead to it being misread.
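As a rough illustration of those different spellings, here are three loops that should all compute the same thing. The bodies of foo and bar and the function names are assumptions for the example only:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stages; the slide does not define foo or bar. */
static int foo(int x) { return x * x; }
static int bar(int x) { return x - 3; }

/* Spelling 1: nested call, no named intermediate. */
void fused_nested(const int *input, int *output, size_t n) {
    assert(input && output);
    for (size_t i = 0; i < n; i++)
        output[i] = bar(foo(input[i]));
}

/* Spelling 2: named intermediate held in a local variable. */
void fused_named(const int *input, int *output, size_t n) {
    assert(input && output);
    for (size_t i = 0; i < n; i++) {
        int t = foo(input[i]);
        output[i] = bar(t);
    }
}

/* Spelling 3: pointer walking instead of indexing. */
void fused_ptr(const int *input, int *output, size_t n) {
    assert(input && output);
    const int *end = input + n;
    while (input != end)
        *output++ = bar(foo(*input++));
}
```

In all three, each input element is loaded once and each output element is stored once; only the surface syntax differs.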