yanzhan2

Cold miss (compulsory miss): the first reference to any block is always a miss, since the data has not been accessed before (unless the block is prefetched); these usually happen at the start of a program. Cold misses can be reduced by increasing the block size or by prefetching.
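
For instance, software prefetching can issue the load for a block well before its first use. A minimal sketch using GCC/Clang's __builtin_prefetch (the prefetch distance is just an assumed tuning value, and hardware prefetchers often handle a simple sequential stream like this on their own):

```c
#include <stddef.h>

#define PREFETCH_DIST 16  /* assumed lookahead; tune per machine */

double sum(const double *a, size_t n) {
    double s = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DIST < n)
            __builtin_prefetch(&a[i + PREFETCH_DIST]);  /* start the load early */
        s += a[i];  /* by now the block is (ideally) already in flight */
    }
    return s;
}
```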

Capacity miss: a miss that occurs because the cache is too small to hold the program's working set; it could be reduced by building a larger cache. But cache sizes are kept small for fast access, so a hierarchical L1/L2/L3 cache structure is used instead.
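
One way to see this: if a computation makes several passes over data larger than the cache, every pass after the first takes capacity misses on each block. A minimal sketch (sizes assumed; fusing the passes means each line is missed only once):

```c
#include <stddef.h>

/* If n * sizeof(float) exceeds the cache, pass 2 misses on every
   block, because pass 1 already evicted the data it needs. */
void two_pass(float *a, size_t n) {
    for (size_t i = 0; i < n; i++) a[i] *= 2.0f;  /* pass 1: cold misses */
    for (size_t i = 0; i < n; i++) a[i] += 1.0f;  /* pass 2: capacity misses */
}

/* Fusing the passes touches each block once while it is resident. */
void fused(float *a, size_t n) {
    for (size_t i = 0; i < n; i++) a[i] = a[i] * 2.0f + 1.0f;
}
```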

Conflict miss: a miss due to limited associativity; the same access would have been a hit in a fully associative cache. But a fully associative cache must search every location on each access, so it is slow and not very practical.
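
A sketch of how conflict misses can arise even when the cache is mostly empty (all sizes here are assumptions for illustration):

```c
#define NROWS 8
#define NCOLS 4096  /* row stride of 4096 floats = 16 KB, a power of two */

/* With a power-of-two stride, every m[r][col] maps to the same cache
   set. Once NROWS exceeds the associativity, each access can evict
   the previous row's line: conflict misses in a nearly empty cache. */
float column_sum(float (*m)[NCOLS], int col) {
    float s = 0.0f;
    for (int r = 0; r < NROWS; r++)
        s += m[r][col];  /* same set every iteration */
    return s;
}
/* A common fix is padding the row, e.g. float m[NROWS][NCOLS + 16],
   so consecutive rows fall into different sets. */
```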

Communication miss: a miss due to the transfer of data between different caches, which becomes a big problem when data ping-pongs between caches. For example, when different processors update a shared location in a shared-memory system, each writer must first obtain the cache line in an exclusive state before writing the data.
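
Here's a minimal sketch of that ping-pong, assuming POSIX threads and C11 atomics: two threads update the same location, and under an invalidation-based protocol each write must first bring the line into an exclusive state, bouncing it between the cores' caches.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>

#define ITERS 10000000L

static atomic_long counter;  /* one location shared by both threads */

void *bump(void *arg) {
    (void)arg;
    for (long i = 0; i < ITERS; i++)
        atomic_fetch_add(&counter, 1);  /* write -> exclusive ownership -> line ping-pongs */
    return NULL;
}

int main(void) {
    pthread_t t0, t1;
    pthread_create(&t0, NULL, bump, NULL);
    pthread_create(&t1, NULL, bump, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    printf("%ld\n", atomic_load(&counter));  /* 2 * ITERS */
    return 0;
}
```

The usual fix is to give each thread a private counter and combine the results at the end, so no line is written by more than one core.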

moon

Question: I understand that cold misses occur when data has not yet been accessed and therefore can't be in the cache already. This slide notes that they are "unavoidable in a sequential program". Are they therefore avoidable in non-sequential (I'm assuming this means parallel) programs?

yrkumar

@moon: I don't think that cold misses are avoidable in a non-sequential program either. The point of a cold miss is that it happens because the cache starts out empty, which is just as true at the beginning of a parallel program. The only way to avoid one is to prefetch data into the cache ahead of time, but the prefetch itself incurs the miss, so it isn't really a solution.

kayvonf

Consider a program with two threads running on a processor with two cores and a cache shared between the cores. Thread 0 reads address X for the first time, causing X to be loaded into the cache. Then, at some later point, thread 1 reads X for the first time. While by definition this should be a cold miss for thread 1 (this thread has never accessed the data before), the access is actually serviced by the cache since the line was previously loaded by thread 0. In a sense, thread 0 has effectively prefetched the value for thread 1, so thread 1 avoided what would otherwise have been a cold miss when considering its behavior alone.
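
A minimal sketch of that scenario, assuming the two threads end up on cores sharing a cache (pinning threads to cores is OS-specific and omitted here):

```c
#include <pthread.h>

static double X[1024];        /* stands in for the data at address X */
static volatile double sink;  /* keeps the loops from being optimized away */
static pthread_barrier_t barrier;

void *thread0_fn(void *arg) {
    (void)arg;
    double s = 0;
    for (int i = 0; i < 1024; i++) s += X[i];  /* cold misses: loads X into the shared cache */
    sink = s;
    pthread_barrier_wait(&barrier);            /* make sure thread 0's pass finishes first */
    return NULL;
}

void *thread1_fn(void *arg) {
    (void)arg;
    pthread_barrier_wait(&barrier);
    double s = 0;
    for (int i = 0; i < 1024; i++) s += X[i];  /* thread 1's first touch, yet the lines hit */
    sink = s;
    return NULL;
}

int main(void) {
    pthread_t t0, t1;
    pthread_barrier_init(&barrier, NULL, 2);
    pthread_create(&t0, NULL, thread0_fn, NULL);
    pthread_create(&t1, NULL, thread1_fn, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    pthread_barrier_destroy(&barrier);
    return 0;
}
```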