According to this slide, it seems that the only difference between int x = 10 and x = 10 is that in the second case there may not be a TLB miss, so the cache can look up tags in step 3. Is there any other difference?
int x = 10
x = 10
(Another question, not an answer to the above)
int x = 0;
is on the stack, right? We know that threads have their own stack space. So wouldn't it be the case that there is no need for invalidation here?
@rmanne I think you are right. 'int x = 10' creates a local variable; the steps described in the slides fit 'x = 10' better, since that may need to fetch data from the heap.
The "stack" is still part of the process's address space. And in this class we've assumed coherence is enabled for all addresses in the address space. Therefore, all addresses are subject to the coherence protocol, including those on a thread's stack.
It's also certainly the case that I could write a program where thread T0 passed a pointer to a variable on its local stack to another thread. Just because data is on the stack doesn't mean it cannot be shared.
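To make the pointer-passing case concrete, here is a minimal sketch in C++ (my choice of language; the thread doesn't name one). The names writer and run_stack_sharing_demo are hypothetical, made up for illustration. T0 declares x on its own stack, hands &x to a second thread, and then sees that thread's write after joining it.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: runs on a second thread (T1) and writes through a
// pointer that refers to a variable living on the first thread's stack.
void writer(int* shared) {
    *shared = 10;
}

// Hypothetical demo, called on thread T0.
int run_stack_sharing_demo() {
    int x = 0;                   // x lives on T0's stack
    std::thread t1(writer, &x);  // T1 receives a pointer into T0's stack
    t1.join();                   // join() orders T1's write before the read below
    assert(x == 10);             // T0 observes the value written by T1
    return x;
}
```

If the two threads run on different cores, the line holding x can end up in both caches, so the write in writer is subject to the same coherence actions as a write to any other shared address.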
Of course, you could consider designing a system where coherence is only enabled for some parts of the address space (e.g., coherence is enabled/disabled on a per-physical-page basis). And in fact this is true of modern integrated CPU/GPU systems.
I am a little confused about step 16, which says memory wins the bus. Is there anything competing with memory for access to the response bus when we are using a split-transaction bus?
@1pct Memory is competing with other caches, say if other caches are servicing a miss.