Remember, the closer a memory space is to the thread, the cheaper its accesses: per-thread memory is cheapest, then per-block shared memory, then device global memory.
max
All threads can share memory if they are in the same block.
There is one instance of shared memory per block, one instance of local memory per thread, and a single instance of global memory that all threads can read and write.
miko
Question:
Does the 'device global memory' in this diagram refer to the memory of the GPU? If so, does the GPU's memory actually behave in exactly the same manner as main memory does (perform caching and such) or are there some small discrepancies between the two?
kayvonf
Yes. In modern GPUs, device global memory corresponds to high-performance GDDR5 DRAM resident on the GPU board (but not on chip). You can think of this memory just as you think of main system memory accessible to a CPU (typically DDR3 these days). The GPU does cache part of this address space, although GPU caches tend to be smaller than those on a CPU.
DanceWithDragon
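To summarize this part: suppose we have M blocks, and each block has N threads.
Then there are M × N instances of per-thread local memory, M instances of per-block shared memory, and one instance of global memory.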
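For anyone who wants to see the three memory spaces from this thread side by side, here is a minimal CUDA sketch. Everything in it is an assumption made for illustration and is not from the slide: the kernel name `double_and_reverse`, the helper `run_demo`, and the values M = 4 and N = 128. The kernel reads from global memory, stages values in per-block shared memory, and keeps its indices in per-thread variables.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int M = 4;    // number of blocks (illustrative value)
constexpr int N = 128;  // threads per block (illustrative value)

// Hypothetical kernel: doubles each element, then reverses the order within each block.
__global__ void double_and_reverse(const float* in, float* out) {
    // Per-block shared memory: one copy of `tile` per block, so M copies in total.
    __shared__ float tile[N];

    // Per-thread private variables: one copy per thread, so M * N copies in total.
    int t = threadIdx.x;
    int i = blockIdx.x * blockDim.x + t;
    float v = 2.0f * in[i];  // `in` points into device global memory (one copy, visible to all threads)

    tile[t] = v;
    __syncthreads();  // wait until every thread in this block has written its slot

    // Read a value that a different thread in the same block produced.
    out[i] = tile[N - 1 - t];
}

// Runs the kernel on M * N elements.
// Returns the number of wrong outputs, or -1 if a CUDA call fails.
int run_demo() {
    static float h_in[M * N], h_out[M * N];
    for (int i = 0; i < M * N; ++i) h_in[i] = static_cast<float>(i);

    const size_t bytes = M * N * sizeof(float);
    float* d_in = nullptr;
    float* d_out = nullptr;
    if (cudaMalloc(&d_in, bytes) != cudaSuccess) return -1;
    if (cudaMalloc(&d_out, bytes) != cudaSuccess) { cudaFree(d_in); return -1; }

    cudaError_t err = cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        double_and_reverse<<<M, N>>>(d_in, d_out);  // M blocks of N threads
        err = cudaGetLastError();                   // catches launch failures
    }
    if (err == cudaSuccess) {
        // This blocking copy also waits for the kernel to finish.
        err = cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);
    }
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int wrong = 0;
    for (int b = 0; b < M; ++b)
        for (int t = 0; t < N; ++t)
            if (h_out[b * N + t] != 2.0f * h_in[b * N + (N - 1 - t)]) ++wrong;
    return wrong;
}

int main() {
    printf("wrong outputs: %d\n", run_demo());
    return 0;
}
```

The reversal only works within a block because `tile` is private to that block. A thread in block 0 has no way to read block 1's `tile`, so anything that must cross blocks has to go through global memory.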