retterermoore

How are these finite state machines usually implemented? I imagine they would have to be very efficient.

pwei

At least in terms of digital design, a finite state machine with 5 states requires very little actual hardware and can advance once per clock cycle (I'm not sure whether this clock is the same as the CPU or GPU clock, but I imagine it could be). It would take 3-5 D flip-flops depending on the state encoding (binary vs. one-hot), roughly 36 to 60 transistors, plus some input and output logic to go along with it (<100 transistors). On chips with monstrous transistor counts (in the billions), this is almost nothing. In terms of speed, I'm pretty sure the flip-flops can run at GHz clock frequencies, so the finite state machine should not slow down any part of the caching.
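In software terms, the per-line state machine is just a small transition table. Here is a minimal C sketch (the event names like PR_WRITE and BUS_RD are made up, and only a couple of transition rows are filled in) just to show how little state and logic is involved: 5 states fit in 3 bits of storage, and each transition is a handful of gates in real hardware rather than a switch statement.

```c
#include <stdio.h>

/* 5 states need only 3 bits of storage per cache line (or 5 bits one-hot). */
typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED, FORWARD } line_state_t;

/* Events the per-line state machine reacts to (names are illustrative). */
typedef enum { PR_READ, PR_WRITE, BUS_RD, BUS_RDX } coherence_event_t;

/* One illustrative slice of the transition table: what a cache does to its
   own copy of a line when it observes an event. */
static line_state_t next_state(line_state_t s, coherence_event_t e) {
    switch (s) {
    case EXCLUSIVE:
        if (e == PR_WRITE) return MODIFIED;  /* silent upgrade, no bus traffic */
        if (e == BUS_RD)   return SHARED;    /* another cache read the line */
        if (e == BUS_RDX)  return INVALID;   /* another cache intends to write */
        return EXCLUSIVE;
    case MODIFIED:
        if (e == BUS_RD)   return SHARED;    /* supply the data, then share */
        if (e == BUS_RDX)  return INVALID;
        return MODIFIED;
    default:
        return s;                            /* remaining rows omitted */
    }
}

int main(void) {
    line_state_t s = EXCLUSIVE;
    s = next_state(s, PR_WRITE);             /* E -> M on a local write */
    printf("state = %d\n", (int)s);
    return 0;
}
```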

kayvonf

Question: How does MESIF work? Specifically, how is the cache that supplies data in the event of a cache miss determined in MESIF? (BTW: Wikipedia has a nice description)

Question: How does MOESI work? (again, see Wikipedia)

pinkertonpg

In the MESIF protocol, the M, E, S, and I states are the same as in MESI, but it adds a new F (forward) state that is a specialized form of the S state. In MESI, if a processor has a cache miss, there are two ways to get the data: either the data is read from main memory (slow), or all of the caches that also hold that particular line can respond (redundant communication). Instead, in MESIF, the line in one cache, and only one cache, is designated as being in the F state. This way only one cache needs to forward the data to whichever other cache requests it. This ensures the data transfer is both minimal (only one cache sends the data, as opposed to all caches that hold the line) and quick (cache-to-cache speeds).
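A toy C sketch of that miss-handling decision (the data structures and names are invented for illustration; a real controller does this with snoop logic in hardware, not a loop): because S-state copies stay silent, at most one cache answers a request, and memory is used only when no cache can forward.

```c
#include <stdio.h>

#define NUM_CACHES 4

typedef enum { I, S, E, M, F } state_t;

/* Decide which cache, if any, supplies data for a read miss by `requester`.
   In MESIF an S-state copy never responds; only the single F-state copy
   (or an M/E owner) does, so there is at most one cache-to-cache transfer. */
static int pick_responder(const state_t line[NUM_CACHES], int requester) {
    for (int c = 0; c < NUM_CACHES; c++) {
        if (c == requester) continue;
        if (line[c] == F || line[c] == M || line[c] == E)
            return c;                 /* at most one such cache can exist */
    }
    return -1;                        /* no forwarder: read from main memory */
}

int main(void) {
    state_t line[NUM_CACHES] = { S, F, S, I };  /* three sharers, cache 1 holds F */
    int responder = pick_responder(line, 3);    /* cache 3 misses on the line */
    if (responder >= 0)
        printf("cache %d forwards the line\n", responder);
    else
        printf("no forwarder; read from main memory\n");
    return 0;
}
```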

The MESIF protocol determines the F-state cache by passing the F state along to the cache that most recently requested the line. So, if cache x receives data for a particular line from another cache y, then cache x now holds that line in the F state, and cache y transitions to the S state. It is possible that the cache holding the line in the F state evicts or invalidates it before another cache wants it, so that no cache has the line in the F state. In this case, the data is read from main memory and the requesting cache is designated the new F-state holder. Because the F state follows a most-recently-used policy, the hope is that this situation occurs infrequently.
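That F-state handoff can be sketched in a few lines as well (again purely illustrative with made-up names; real hardware performs this as part of the interconnect transaction itself): the old forwarder demotes itself to S, and the most recent requester becomes the new forwarder.

```c
#include <stdio.h>

#define NUM_CACHES 4

typedef enum { INV, SHD, FWD } sh_state_t;   /* shared-side states only */

/* Service a read miss: if some cache holds the line in F, it forwards the
   data and demotes itself to S; the requester becomes the new F-holder.
   If no F-holder exists (it dropped the line), memory supplies the data
   and the requester is still designated F. */
static void service_read_miss(sh_state_t line[NUM_CACHES], int requester) {
    for (int c = 0; c < NUM_CACHES; c++) {
        if (line[c] == FWD) {
            line[c] = SHD;            /* old forwarder keeps a shared copy */
            break;
        }
    }
    line[requester] = FWD;            /* most recent requester now forwards */
}

int main(void) {
    sh_state_t line[NUM_CACHES] = { SHD, FWD, INV, INV };
    service_read_miss(line, 2);       /* cache 2 misses; cache 1 forwards */
    for (int c = 0; c < NUM_CACHES; c++)
        printf("cache %d: %c\n", c,
               line[c] == FWD ? 'F' : line[c] == SHD ? 'S' : 'I');
    return 0;
}
```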