Slide View : Parallel Computer Architecture and Programming : 15-418/618 Fall 2016

Slide 48 of 56

Iamme

Suppose I'm programming on a GPU and I want to use a data structure that requires locks due to modifications from multiple threads.

It seems like the lack of guaranteed cache coherence makes my locks useless, both because the caches may not hold the most up-to-date copy of the data structure and because they may not even agree on the value of the lock!

Are there ways to account for this and use such a data structure anyway?

anamdev

I think in this situation it is on the programmer to add synchronization that waits until a thread has updated a value in global memory before allowing the next thread to access the updated data. Also, each block (of up to 512 threads on older GPUs, 1024 on newer ones) has its own cache. The main way you might run into this issue is if threads from different blocks are updating global memory. If this is the case, it might be better to store each block's updates in shared memory first, then combine the shared memory from different blocks later to update global memory.
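
To make the "shared memory first, then combine" idea concrete, here is a minimal CUDA sketch. It assumes the data structure is a simple histogram, which is not something from the original question; the names NUM_BINS, THREADS, and histogram are hypothetical and chosen only for illustration. Each block accumulates into its own shared-memory copy, and only the final merge touches global memory, using atomicAdd so that updates from different blocks are not lost.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int NUM_BINS = 64;   // hypothetical bin count for this sketch
constexpr int THREADS  = 256;  // threads per block (an assumption)

__global__ void histogram(const unsigned char* data, int n,
                          unsigned int* global_bins) {
    // Private per-block copy of the bins, visible only to this block.
    __shared__ unsigned int local_bins[NUM_BINS];
    for (int b = threadIdx.x; b < NUM_BINS; b += blockDim.x)
        local_bins[b] = 0;
    __syncthreads();  // all bins are zeroed before anyone counts

    // Grid-stride loop: threads in the block update the private copy.
    for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < n;
         i += gridDim.x * blockDim.x)
        atomicAdd(&local_bins[data[i] % NUM_BINS], 1u);
    __syncthreads();  // the block's counts are complete

    // Merge this block's counts into global memory with atomics, so
    // concurrent merges from other blocks do not overwrite each other.
    for (int b = threadIdx.x; b < NUM_BINS; b += blockDim.x)
        atomicAdd(&global_bins[b], local_bins[b]);
}

int main() {
    const int n = 1 << 20;
    unsigned char* h_data = new unsigned char[n];
    for (int i = 0; i < n; ++i)
        h_data[i] = static_cast<unsigned char>(i % NUM_BINS);

    unsigned char* d_data = nullptr;
    unsigned int* d_bins = nullptr;
    cudaMalloc(&d_data, n);
    cudaMalloc(&d_bins, NUM_BINS * sizeof(unsigned int));
    cudaMemcpy(d_data, h_data, n, cudaMemcpyHostToDevice);
    cudaMemset(d_bins, 0, NUM_BINS * sizeof(unsigned int));

    histogram<<<32, THREADS>>>(d_data, n, d_bins);

    // This blocking copy also waits for the kernel to finish.
    unsigned int h_bins[NUM_BINS];
    cudaMemcpy(h_bins, d_bins, sizeof(h_bins), cudaMemcpyDeviceToHost);

    // Every bin should hold the same count for this input.
    bool ok = true;
    for (int b = 0; b < NUM_BINS; ++b)
        if (h_bins[b] != static_cast<unsigned int>(n / NUM_BINS)) ok = false;
    printf("%s\n", ok ? "histogram ok" : "histogram mismatch");

    cudaFree(d_data);
    cudaFree(d_bins);
    delete[] h_data;
    return ok ? 0 : 1;
}
```

This avoids a lock entirely, which only works when the updates can be expressed as atomic operations. For a structure that really needs a critical section across blocks, my understanding is that you would build the lock from atomics such as atomicCAS and pair it with __threadfence() so that writes made inside the critical section are visible to the next holder, but I have not shown that here.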