If we wanted to think of CUDA in ISPC terms, blocks are similar to tasks, and each thread is similar to an individual SIMD lane.
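As a rough sketch of that mapping (illustrative names, not from the original post): launching a grid of blocks is analogous to spawning ISPC tasks, and each thread handles one index much like a program instance handles one lane.

```cuda
// Sketch: a SAXPY kernel illustrating the hierarchy.
// Each block is roughly analogous to an ISPC task; each thread
// handles one element, like a single SIMD lane / program instance.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global index
    if (i < n)                                      // guard the tail
        y[i] = a * x[i] + y[i];
}

// Launch enough 256-thread blocks to cover n elements:
//   saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, d_x, d_y);
```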
eosofsky
To add on to @apadwekar's comment, a CUDA warp is similar to an ISPC gang (each CUDA thread in a warp is running in a SIMD lane, much like an instance in an ISPC gang).
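To make the warp/gang analogy concrete (a sketch, assuming the warp shuffle intrinsics from CUDA's API): the 32 threads of a warp can exchange values with each other much like the lanes of one ISPC gang in a cross-lane reduction.

```cuda
// Sketch: a warp-level sum, analogous to a cross-gang reduction
// in ISPC. The 32 threads of one warp pass values to each other
// with shuffle intrinsics, just as lanes of one SIMD gang would.
__inline__ __device__ float warpReduceSum(float val) {
    // 0xffffffff: all 32 lanes of the warp participate.
    for (int offset = warpSize / 2; offset > 0; offset /= 2)
        val += __shfl_down_sync(0xffffffff, val, offset);
    return val;  // lane 0 ends up holding the warp-wide sum
}
```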
username
So I know that the __shared__ keyword specifies that the memory being allocated is shared by the block, but how would you specify whether you want to allocate memory on a per-thread basis versus a global basis?
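For reference (a sketch, not part of the original question): ordinary local variables inside a kernel are per-thread, __shared__ arrays are per-block, and global memory is either allocated from the host with cudaMalloc or declared with __device__.

```cuda
__device__ float g_scale = 2.0f;      // global: one copy, visible to all threads

__global__ void example(float *out) { // 'out' points to global memory
                                      // allocated via cudaMalloc on the host
    float t = threadIdx.x * g_scale;  // local variable: private to each thread
    __shared__ float buf[256];        // shared: one copy per thread block
    buf[threadIdx.x] = t;
    __syncthreads();                  // wait for all threads in the block
    out[blockIdx.x * blockDim.x + threadIdx.x] = buf[threadIdx.x];
}
```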
o_o
So, just summarizing the thread hierarchy: each thread runs the kernel for one index, warps are groups of threads that execute the same instruction in lockstep, and thread blocks are groups of warps that have access to the same shared memory. Is this correct?
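That hierarchy can be sketched directly in code (illustrative kernel name; warps are carved out of a block in threadIdx order, 32 threads each):

```cuda
// Sketch: where a thread sits in the hierarchy.
__global__ void whereAmI(int *ids) {
    int globalId    = blockIdx.x * blockDim.x + threadIdx.x;
    int warpInBlock = threadIdx.x / warpSize;  // which warp within this block
    int lane        = threadIdx.x % warpSize;  // which lane within that warp
    // Threads sharing (blockIdx.x, warpInBlock) issue the same
    // instruction together; threads sharing blockIdx.x can also
    // communicate through the block's __shared__ memory.
    ids[globalId] = lane;
}
```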