This is not an example from the producer-consumer category! This is an example of efficient input data reuse!
To expand on the above comment: this reordering is an example of input data reuse because all the matrix values stay in memory and are only read, over and over. We know that each value must eventually be read O(N) times, so every time we bring one into the cache we want to read it as many times as we can before it is evicted.
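As a rough illustration of the idea, here is a minimal sketch that assumes the reordering under discussion is a tiled (blocked) matrix multiplication. The function names `matmul_naive` and `matmul_tiled` and the `tile` parameter are hypothetical, chosen only for this example, and are not taken from the original code. Python is used just to show the loop structure; the cache benefit itself would only be measurable in a compiled language.

```python
def matmul_naive(a, b):
    """Reference triple loop: each column of b is re-read for every row of a."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            for k in range(m):
                c[i][j] += a[i][k] * b[k][j]
    return c


def matmul_tiled(a, b, tile=32):
    """Same arithmetic, reordered so each tile of a and b is reused many
    times while it is still likely to be in cache (tile size is an assumption)."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # Outer loops walk over tiles; inner loops do all the work for one tile.
    for i0 in range(0, n, tile):
        for k0 in range(0, m, tile):
            for j0 in range(0, p, tile):
                for i in range(i0, min(i0 + tile, n)):
                    row_c = c[i]
                    for k in range(k0, min(k0 + tile, m)):
                        a_ik = a[i][k]   # read once, reused across the j loop
                        row_b = b[k]
                        for j in range(j0, min(j0 + tile, p)):
                            row_c[j] += a_ik * row_b[j]
    return c
```

Both functions compute the same product; only the order in which the inputs are visited differs. In the tiled version a small block of each input is touched repeatedly before the loops move on, which is the reuse pattern described above.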
Although this code produces tangible performance benefits, it is only appropriate in performance-critical applications. Where this kind of calculation is neither a bottleneck nor performed very often, the code is overkill: it is much harder to maintain and much more error-prone.