I think that a CUDA thread is more like an ISPC program instance. (I'm not sure if by 'loop iteration' you mean a single program instance, or a gang of ISPC program instances.)
This comment was marked helpful 0 times.
chaominy
@jpaulson: I don't think CUDA thread == one ALU, CUDA block == one core are correct. Because one core in GPU could contain different blocks, and multiple threads may exists inside one ALU.
This comment was marked helpful 0 times.
kayvonf
@jpaulson. Be careful about the use of == since the the two quantities you are equating very different. One is a CUDA abstraction, and the other is a GPU hardware unit. Really the concept you are thinking about is the relationship between programming model concept and its implementation on a hardware unit.
Current CUDA implementations execute a CUDA thread using a single ALU in a SIMD group of ALUs. A CUDA thread block is mapped to a single GPU core, but multiple thread blocks can be interleaved on a core at once. As @mmp says, a good analogy to a CUDA thread is an ISPC program instance.
For more detailed discussions of how CUDA programs execute on modern GPUs, take a look at the next slide as well as slide 52.
CUDA thread == one ALU == ISPC loop iteration
CUDA block == one core == ISPC task
This comment was marked helpful 0 times.
I think that a CUDA thread is more like an ISPC program instance. (I'm not sure if by 'loop iteration' you mean a single program instance, or a gang of ISPC program instances.)
This comment was marked helpful 0 times.
@jpaulson: I don't think CUDA thread == one ALU, CUDA block == one core are correct. Because one core in GPU could contain different blocks, and multiple threads may exists inside one ALU.
This comment was marked helpful 0 times.
@jpaulson. Be careful about the use of
==
since the the two quantities you are equating very different. One is a CUDA abstraction, and the other is a GPU hardware unit. Really the concept you are thinking about is the relationship between programming model concept and its implementation on a hardware unit.Current CUDA implementations execute a CUDA thread using a single ALU in a SIMD group of ALUs. A CUDA thread block is mapped to a single GPU core, but multiple thread blocks can be interleaved on a core at once. As @mmp says, a good analogy to a CUDA thread is an ISPC program instance.
For more detailed discussions of how CUDA programs execute on modern GPUs, take a look at the next slide as well as slide 52.
This comment was marked helpful 0 times.