Here, I see the dependencies include every point adjacent to the point we're looking at (up, down, left, and right), but in lecture Kayvon only identifies the points that are above and to the left of the point as the dependencies. Why is this?

bojianh

@maxdecmeridius

Take note that the two for-loops move rightward across each row and then downward to the next row.

-->

1 2

Suppose we process "1" at time t: it is correct for "1" to read the value "2" holds at time t while "1" is being updated. When we later process "2" at time t+d, it is correct for "2" to read the value "1" holds at time t+d, i.e., the freshly updated value.

However, imagine we reverse the processing order of "1" and "2". In that case "2" reads an older copy of "1"'s value and "1" reads a newer copy of "2"'s value, which does not match the sequential sweep and is therefore incorrect.
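To make the ordering concrete, here is a minimal sketch (the grid size and the four-point averaging stencil are illustrative assumptions, not the exact code from lecture) of one in-place Gauss-Seidel sweep. Because the loops visit cells left to right, then top to bottom, the up/left neighbors of a cell have already been updated this sweep, while the down/right neighbors still hold values from the previous sweep — which is exactly why only the up/left points appear as dependencies.

```python
def gauss_seidel_sweep(grid):
    """One in-place sweep over the interior cells of a square grid."""
    n = len(grid)
    for i in range(1, n - 1):        # top to bottom
        for j in range(1, n - 1):    # left to right within each row
            grid[i][j] = 0.25 * (
                grid[i - 1][j]       # up:    already updated this sweep
                + grid[i][j - 1]     # left:  already updated this sweep
                + grid[i + 1][j]     # down:  still holds previous-sweep value
                + grid[i][j + 1]     # right: still holds previous-sweep value
            )
    return grid
```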

ojd

For those who are interested, Gauss-Seidel falls into a class of algorithms known as stencil computations. There are many algorithms available for automatic detection, parallelization, and optimization of this class. The central idea for most of these methods is that if you can bound the dependencies within a constant-size cone, you can generate a new set of (potentially more optimal) basis vectors for the loop iteration, so loop optimization becomes an affine transformation on the polytope generated by the dependencies of the nested loops. The method closest to what was shown here is known as Diamond Tiling, while the more GPU-friendly variant is known as Hexagonal Tiling. (The names come from the shape you'd draw to illustrate work batching on a 2D dependency grid, in the same style shown in these slides.)
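Diamond and hexagonal tiling are too involved to sketch here, but the same dependence reasoning justifies a simpler reordering worth seeing: since each cell depends only on its already-updated up/left neighbors, all interior cells on one anti-diagonal i + j = const are mutually independent and could be updated in parallel. A minimal sequential sketch of that wavefront schedule (grid size and stencil are illustrative assumptions) — it visits cells in a different order than the row-major sweep but, because it respects the same dependence cone, computes identical results:

```python
def gauss_seidel_wavefront(grid):
    """Sweep the interior cells anti-diagonal by anti-diagonal.

    Cells with the same i + j have no dependencies on each other,
    so each inner loop is a candidate for parallel execution.
    """
    n = len(grid)
    for d in range(2, 2 * (n - 2) + 1):   # d = i + j over interior cells
        for i in range(1, n - 1):
            j = d - i
            if 1 <= j <= n - 2:
                grid[i][j] = 0.25 * (grid[i - 1][j] + grid[i][j - 1]
                                     + grid[i + 1][j] + grid[i][j + 1])
    return grid
```

Each cell's up/left neighbors lie on diagonal d - 1 (already done) and its down/right neighbors on d + 1 (not yet touched), so the update order is a valid topological sort of the dependence graph from the lecture.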
