I get the idea of the limited pointer scheme: when the number of processors P is very large, we can reduce the storage requirement by only storing the IDs of the processors that have a copy, which is usually far fewer than P.
But how do sparse directories work? If we only keep directory entries for the lines that are in cache, how do we tell who holds a particular memory address? Do we have to scan every entry in the directory to see if anyone matches and is in cache? Does each cache line still have a fixed location in the directory?
@haibinl it was mentioned in class that the home node of a particular memory address is easily computable.
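A minimal sketch of what "easily computable" can mean, assuming memory is interleaved across nodes at cache-line granularity (the `LINE_SIZE` and `NUM_NODES` parameters are made up for illustration; real systems may use a different mapping):

```python
LINE_SIZE = 64      # bytes per cache line (assumed)
NUM_NODES = 16      # number of nodes/directories (assumed)

def home_node(addr: int) -> int:
    """Return the node whose directory tracks this address."""
    line_number = addr // LINE_SIZE
    return line_number % NUM_NODES
```

No scan of directory entries is needed: under a mapping like this, every address names exactly one home directory by simple arithmetic.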
Since it is hard to implement a doubly linked list in hardware, why do we do this in the first place? What is the point of chaining the lines (pointers) with this data structure?
I'm also wondering why we have such a linked list data structure. If the directory is designed as on slide 23, where each line in memory has a reserved entry in the directory, we get a huge directory that takes a lot of storage.
Now with sparse directories, we only keep track of the memory lines that are in cache, so the directory doesn't take as much storage as before. In this case, it's possible that multiple cache lines map to the same entry location in the directory, which may explain why we have these linked lists?
The list keeps track of which processors have the line cached.
With sparse directories, we only need to track the lines we have cached in addition to the lines cached from our local memory.
This means that in a system with 1024 processors, each cache line only needs about 20 bits of overhead (two 10-bit pointers for the doubly linked list) in addition to the line data, instead of a 1024-bit vector.
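A back-of-the-envelope check of that overhead claim, assuming the doubly linked list needs two processor-ID-sized pointers per cached line:

```python
import math

P = 1024
ptr_bits = math.ceil(math.log2(P))   # bits needed to name one processor: 10
list_overhead = 2 * ptr_bits         # forward + backward pointer: 20 bits
bitvec_overhead = P                  # full bit-vector directory: 1024 bits
```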
Why does the linked list have to be doubly linked? I'm not sure about the linked list implementation at the circuit level - wouldn't it be easier to just implement a singly linked list? What advantage does a doubly linked list offer? Do we ever need to traverse the list in reverse order? Doesn't the traversal always start at the home directory?
From the slide, it seems the optimization comes from storing only the address of the first node that caches the data, but this doesn't exploit the fact that "the majority of memory is not resident in cache". Instead, this seems a lot like just another approach to reducing the size of P.
Removing an item from a doubly linked list takes O(1), but in a singly linked list it takes O(N). When a line is evicted, we need to remove it from the linked list, so in this case the doubly linked list is superior.
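A minimal sketch of why eviction is O(1) with a doubly linked list: given a pointer to the node being evicted, its neighbors can be patched directly, with no traversal from the head. (The `Node` class here is just an illustration, not the hardware representation.)

```python
class Node:
    def __init__(self, proc_id):
        self.proc_id = proc_id   # processor caching the line
        self.prev = None
        self.next = None

def unlink(node):
    """Remove `node` from its list in O(1)."""
    if node.prev is not None:
        node.prev.next = node.next
    if node.next is not None:
        node.next.prev = node.prev
    node.prev = node.next = None
```

With a singly linked list, finding the predecessor of the evicted node requires walking from the head: O(N) hops (messages between caches) in the worst case.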
What about the dirty bit in the bit-vector scheme - how is the dirty state represented here?
If there's a write hit in the middle of the linked list, wouldn't you have to propagate invalidations in both directions?
@IntergalacticPeanutMaker You could also just send the write to the owner of the data, and then traverse the list the same as a singly linked list.
Speaking in data structure terms:
write: linked-list traverse
read miss: add node to linked-list
evict: delete node from linked-list
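The three operations above can be sketched as a toy model of the sharing list for a single memory line, assuming the home directory holds the head pointer and each sharer is one list node. Real hardware does this with messages between caches; this only illustrates the data-structure mechanics:

```python
class Sharer:
    """One cache holding a copy of the line; a node in the sharing list."""
    def __init__(self, proc_id):
        self.proc_id = proc_id
        self.prev = None
        self.next = None

class SharingList:
    """Sharing list for one memory line; head held at the home directory."""
    def __init__(self):
        self.head = None

    def read_miss(self, proc_id):
        """Add node: the new sharer is inserted at the head of the list."""
        node = Sharer(proc_id)
        node.next = self.head
        if self.head is not None:
            self.head.prev = node
        self.head = node
        return node

    def evict(self, node):
        """Delete node: O(1) unlink, the reason the list is doubly linked."""
        if node.prev is not None:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next is not None:
            node.next.prev = node.prev

    def write(self, writer_id):
        """Traverse: collect the other sharers to invalidate, then keep
        only the writer in the list."""
        invalidated = [n.proc_id for n in self.sharers()
                       if n.proc_id != writer_id]
        self.head = Sharer(writer_id)
        return invalidated

    def sharers(self):
        node = self.head
        while node is not None:
            yield node
            node = node.next
```

For example, after read misses by processors 1, 2, and 3, evicting processor 2's node takes a single unlink, and a write by processor 3 traverses the list and returns `[1]` as the set to invalidate.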
@hofstee What do you mean by the owner of the data? As I understand it, every cache line in these types of linked lists holds a copy of some data stored at a common memory address.
So don't they all "own" the data?
@IntergalacticPeanutMaker by owner I mean home node.