Slide 39 of 43

With multiple bases, we run into the problem of determining how to choose a base and how to know which one to use when decoding. Ideas for choosing a base include using the mean, sorting the bytes, or taking the min/max.
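As a rough illustration of the base+delta idea being discussed, here is a minimal sketch (not the actual hardware scheme) that picks the minimum of a block as the base in a single pass and stores each value as a smaller delta from it:

```python
# Sketch of base+delta compression for a block of integers.
# The base is chosen as the block minimum (one of the ideas above);
# each value is then stored as a non-negative delta from that base.
def compress(block):
    base = min(block)                      # single pass to pick the base
    deltas = [v - base for v in block]
    # bytes needed to represent the largest delta
    width = max((d.bit_length() + 7) // 8 for d in deltas) or 1
    return base, width, deltas

def decompress(base, width, deltas):
    return [base + d for d in deltas]

block = [1000, 1012, 1007, 1003]
base, width, deltas = compress(block)
assert decompress(base, width, deltas) == block
assert width == 1   # every delta fits in one byte instead of several
```

Taking the min as the base keeps every delta non-negative, which avoids spending a sign bit per element.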


Just as a refresher: this sort of pattern appears in just about any data structure where pointers are involved.


@makingthingsfast, I think it was mentioned in class that while sorting is a good idea, it would take too much compute. So a strategy that goes through each data point once would be optimal.


How does this compression interact with cache coherence? It seems that maintaining cache coherence on the compressed data may be difficult.


@bojianh It's still the same cache line as before, so why would it make a difference?


I mean, the reason one wants to do compression is to fit more cache lines in the cache. But then we would need all those bits to implement MESI, which may be complicated.


In the paper I can find on this, it looks like they put a piece of compression hardware in front of the memory. Since this implies the memory only sees compressed data, coherency shouldn't be an issue. This does seem like it would break the assumptions of some programs, but presumably this isn't too big of an issue in practice.


With this compression strategy, how would we decompress the data? Since the bytes are laid out contiguously in the cache, how would we know where to divide the bytes of consecutive elements, given that each element can be compressed to a different number of bytes?


You could add 2 bits to every element saying how many bytes it takes.
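A minimal sketch of that idea (a hypothetical layout, not necessarily what real hardware does): each element carries a 2-bit tag recording whether its delta occupies 1, 2, 3, or 4 bytes, so the decompressor knows where to split the contiguous byte stream:

```python
# Sketch: per-element 2-bit size tags (0 -> 1 byte .. 3 -> 4 bytes)
# let the decompressor find element boundaries in a contiguous stream.
def encode(deltas):
    tags, stream = [], bytearray()
    for d in deltas:
        nbytes = max(1, (d.bit_length() + 7) // 8)
        tags.append(nbytes - 1)            # 2-bit tag: 0..3
        stream += d.to_bytes(nbytes, "little")
    return tags, bytes(stream)

def decode(tags, stream):
    deltas, pos = [], 0
    for t in tags:
        nbytes = t + 1
        deltas.append(int.from_bytes(stream[pos:pos + nbytes], "little"))
        pos += nbytes
    return deltas

deltas = [5, 300, 70000, 9]
tags, stream = encode(deltas)
assert decode(tags, stream) == deltas
assert tags == [0, 1, 2, 0]   # 1-, 2-, 3-, and 1-byte deltas
```

The tags could live alongside the compressed line as metadata, at a cost of 2 bits per element.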