Slide 7 of 35
kayvonf

Question: Why do you think Intel chose to build 8-way associative L1 and L2 caches, but a 16-way associative L3 cache?

Question: What does "outstanding misses" mean?

askumar

A 16-way associative cache will have a lower miss rate than an 8-way associative cache. A lower miss rate reduces the number of times you have to go to memory to retrieve data, which is very expensive. I remember it was said that 16-way associativity is more expensive than 8-way associativity, but how exactly is it more expensive?

kayvonf

@askumar: By expensive, I meant it is a more complicated cache design. Once the cache determines which set a requested address maps to (using the middle bits of the address), it must check the tags of all lines in that set to determine whether a valid cache line containing the requested data is present. There are twice as many lines to check in a 16-way cache as in an 8-way cache. This translates into either the need for more cache hardware to perform more tag checks in parallel, or a longer cache lookup latency. The cache also has to manage a replacement policy across more lines.
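The set-selection and tag-check steps described above can be sketched in code. This is a toy model, not real hardware: the sizes and names (`LINE_SIZE`, `NUM_SETS`, `ASSOC`, `lookup`) are illustrative assumptions, not taken from the slide.

```python
# Toy model of a set-associative cache lookup (all parameters are illustrative).
LINE_SIZE = 64   # bytes per cache line
NUM_SETS = 64    # number of sets
ASSOC = 8        # ways per set: 8-way associative

# Each set holds ASSOC (valid, tag) entries, one per way; all start invalid.
sets = [[(False, None)] * ASSOC for _ in range(NUM_SETS)]

def lookup(addr):
    """Return True on a hit: index with the middle bits, then check every way's tag."""
    set_index = (addr // LINE_SIZE) % NUM_SETS   # middle bits select the set
    tag = addr // (LINE_SIZE * NUM_SETS)         # upper bits form the tag
    # A 16-way cache would compare twice as many tags here, meaning more
    # comparators working in parallel or a longer lookup.
    return any(valid and t == tag for valid, t in sets[set_index])
```

The `any(...)` over the ways is the part that doubles in cost going from 8-way to 16-way: hardware must either replicate the tag comparators or take longer per lookup.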

Although I'm not an expert on this, I imagine it's more economical to build a 16-way cache as an L3 because you can get by with an implementation that's slower when performing its tag checks. First, as the last-level cache, the L3 is accessed less frequently. Second, an access that reaches the L3 has already incurred major latencies by missing the L1 and L2, so an extra cycle or two in the L3 access critical path doesn't have a huge relative effect on performance. In contrast, the job of the L1 (which services requests all the time) is to return data to the processor with as low latency as possible.

tpassaro

Outstanding misses refer to the number of misses before you try the next level of cache. Since the L1 is shared, I think it's saying we try up to 10 times to find cached data in one of the 8 associative sets.

kayvonf

@tpassaro: Not quite. The idea of outstanding misses is unrelated to how data "is found" in a cache. Anyone else want to give my question a try? Hint: out-of-order processing.

mschervi

From what I've read, I think the concept of outstanding misses is related to a "non-blocking" cache. If a processor asks the L1 cache for data and there is a cache miss, then instead of waiting for the request to be serviced by the L2 cache, the L1 cache has mechanisms to continue working on other requests. The request that had to go to the L2 cache is an "outstanding miss." The L1 cache in this diagram can have up to 10 such misses outstanding at a time before it has to block.
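The behavior mschervi describes can be sketched as a counter on in-flight misses. This is a simplified software analogy (real hardware tracks outstanding misses in miss status holding registers, MSHRs); the names `request`, `fill`, and `MAX_OUTSTANDING` are made up for illustration.

```python
# Sketch of a non-blocking cache that allows a bounded number of
# outstanding misses (analogous to MSHRs in real hardware).
MAX_OUTSTANDING = 10   # the L1 in the slide supports up to 10 outstanding misses

outstanding = set()    # addresses of misses currently being serviced by the L2

def request(addr, hit):
    """Return 'hit', 'miss-issued', or 'stall' (all miss slots busy)."""
    if hit:
        return "hit"                  # hits are serviced regardless of misses in flight
    if len(outstanding) >= MAX_OUTSTANDING:
        return "stall"                # cache must block until some miss returns
    outstanding.add(addr)             # record the miss; keep servicing other requests
    return "miss-issued"

def fill(addr):
    """The L2 responds: this miss is no longer outstanding."""
    outstanding.discard(addr)
```

With 10 misses in flight, an 11th miss stalls, but hits can still be serviced; once any fill arrives, a new miss can be issued again.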

mbury

If, like me, you were asking yourself what N-way associativity means, here is the answer:

Since the cache is much smaller than main memory, it needs some mechanism for mapping main memory addresses to cache lines (not unlike a hash function). If any address can go to any cache line, the cache is called fully associative. If an address can go to only a single cache line, the cache is called direct-mapped. Finally, if an address can map to any one of a set of N cache lines, the cache is said to be N-way associative.
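The three mapping schemes above differ only in how many lines an address is allowed to occupy. A tiny sketch, assuming a hypothetical cache of 16 lines with 64-byte lines (`candidate_lines` is an invented helper, not a real API):

```python
# Which cache lines can a given address's data occupy, for each scheme?
NUM_LINES = 16   # total lines in this toy cache
LINE = 64        # bytes per line

def candidate_lines(addr, ways):
    """ways=1: direct-mapped; ways=NUM_LINES: fully associative; otherwise N-way."""
    num_sets = NUM_LINES // ways
    s = (addr // LINE) % num_sets          # which set the address maps to
    return list(range(s * ways, (s + 1) * ways))  # the lines making up that set
```

Direct-mapped (`ways=1`) yields exactly one candidate line, fully associative (`ways=16`) yields all of them, and 4-way yields a set of four.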