Slide View : Parallel Computer Architecture and Programming : 15-418/618 Fall 2016

Slide 49 of 56

tclarke

Note that this increases graphics driver overhead, but it also increases the amount of precious global memory available to the program.

taoy1

So do we need to worry about reading stale data in a single kernel launch?

For example,

SMM core 0 reads x (so it resides in L1)

SMM core 1 writes x (write-through, data available in L2)

SMM core 0 reads x again (hits in its L1, so it reads stale data)

Split_Personality_Computer

@taoy1 I think this is the case where you'd want to mark x volatile, since another core can modify it. I know some GPU languages have different data structures for read-only versus read-write data; potentially the read-only structures are not marked volatile, while the read-write structures are marked volatile to force everyone to communicate through the L2.
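To make the three-step sequence @taoy1 describes concrete, here is a toy C++ model of it. This is only a sketch: it is plain C++ rather than GPU code, it assumes a private L1 per core that is never invalidated by other cores' writes, and it treats a "volatile" read as one that simply skips L1. The names `L2Cache`, `Core`, `load_volatile`, and `replay` are made up for illustration and say nothing about how a real SMM is built.

```cpp
#include <string>
#include <unordered_map>

// Toy model only: one shared L2 and a private, non-coherent L1 per core.
struct L2Cache {
    std::unordered_map<std::string, int> lines;
};

struct Core {
    L2Cache &l2;
    std::unordered_map<std::string, int> l1;

    explicit Core(L2Cache &shared) : l2(shared) {}

    // Ordinary load: return the L1 copy if present, otherwise fill from L2.
    int load(const std::string &addr) {
        auto hit = l1.find(addr);
        if (hit != l1.end()) return hit->second;  // possibly stale
        int value = l2.lines[addr];
        l1[addr] = value;
        return value;
    }

    // Assumed behavior of a volatile load in this model: skip L1, read L2.
    int load_volatile(const std::string &addr) {
        return l2.lines[addr];
    }

    // Write-through store: updates this core's L1 and the shared L2.
    // Other cores' L1 copies are deliberately NOT invalidated.
    void store(const std::string &addr, int value) {
        l1[addr] = value;
        l2.lines[addr] = value;
    }
};

// Replays the three steps and returns the value core 0 sees on its last read.
int replay(bool use_volatile) {
    L2Cache l2;
    l2.lines["x"] = 0;
    Core core0(l2), core1(l2);

    auto read_x = [&](Core &c) {
        return use_volatile ? c.load_volatile("x") : c.load("x");
    };

    read_x(core0);         // step 1: core 0 reads x
    core1.store("x", 1);   // step 2: core 1 writes x, new value lands in L2
    return read_x(core0);  // step 3: core 0 reads x again
}
```

In this model, `replay(false)` returns the old value 0 because the second read hits core 0's L1, while `replay(true)` returns 1 because both reads go to the shared L2. Whether real hardware caches a given global load in L1 at all depends on the architecture and compile options, so treat this purely as an illustration of the reasoning.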