Previous | Next --- Slide 26 of 51
Back to Lecture Thumbnails
MaxFlowMinCut

The key assumption here is that in practice, many fewer than $P$ processors will be referencing a cache line at any given time. By instead having a fixed number $n$ of pointers per cache line that each reference a processor that is referencing that cache line (if there is one), we instead have the memory overhead of $n \cdot \log_{2}(P)$ bits to keep track of a cache line, rather than the full $P$ bits (which quickly becomes untenable as $P$ gets larger).

sasthana

@MaxFlowMinCut - I think I have understood this slide the way you have articulated it here. Taking an example with reference to the slide here, instead of keeping it general, I hope this is what the slide means.

If we decide we need only 30 bits per cache line instead of 1024 then we can allow sharing among at most 3 processors. This is because we need 10 bits to label a single processor if there are 1024 processors.

If it so happens that a fourth processor wants to share the line as well then we can use one of the ways listed on the next slide. Like fallback to broadcast or eviction.

a

Would we ever want to use the numbers described in this slide? In this example, would we ever want to store 100 10-bit pointers over a full bit vector scheme?

yikesaiting

I think this graph clearly implies one important concept of first half of semester: "Simple strategy for most cases and tolerable complicated scheme for corner cases." And I think that is what Prof. Keyvon wants to tell us