Slide 8 of 24
Xelblade

Increasing the size of the entire cache would only decrease capacity misses, which would really only help Ocean Sim.

Cold misses decrease with cache line size because of spatial locality. Loading one value in the line gets you the rest of the line, which you don't have to load again.

Conflict misses decrease with cache line size for the same reason as cold misses; there's a smaller chance that we have to load another line.

True sharing decreases with cache line size since, when lots of contiguous data must be written and read, it fits in fewer cache lines.

False sharing increases with cache line size because there's a greater chance of artifactual communication.

jpaulson

Increasing cache line size reduces true sharing because the same amount of data fits in fewer cache lines, and you take one miss per cache line.

bourne

False sharing isn't really an issue with a program like Ocean Sim because almost all of the memory it needs will be in the same cache lines due to locality. Something like radix sort accesses memory from many different areas but only needs a few elements at a time, so it will bring in an entire cache line for one piece of data, and that can result in false sharing.

kayvonf

@Xelblade: A correction to your comment (which was otherwise great). The graph plots conflict + capacity misses in the same bar. Capacity misses will decrease with increasing cache line size when high spatial locality is present. Conflict misses, however, increase with cache line size, since bigger lines result in fewer sets and thus the potential for more conflicts. Overall, (capacity + conflict) misses are going down here with increasing cache line size, presumably because the decrease in capacity misses outweighs the increase in conflict misses.

The true sharing and false sharing results on this graph are the really important ones in the context of this lecture.

kayvonf

@bourne: False sharing can indeed be a problem in applications like Ocean, for the reason given in this slide. However, this graph clearly shows that in this experiment, most of the coherence misses are true sharing misses, not false sharing misses. This means that most of the data in the cache lines being communicated is in fact used by the application.

kuity

A question about true sharing and increased cache line size: my understanding is that since shared data now fits in fewer cache lines, processors invalidate fewer cache lines due to true sharing. However, the cost of each eviction is now higher because of the increased cache line size. Overall, does this lead to a better or worse result?

stephyeung

@kuity: I think, per this later slide, it depends, but the latency is okay if you have a way to hide it, such as multi-threading.