mschervi

Question: Would it be reasonable to think of rays in a packet as being implemented as threads in a warp in CUDA? I am not sure because, while they share similarities such as executing in SIMD, a warp is a lower-level implementation detail that the programmer would not necessarily be able to manipulate (unless it's exposed in more recent versions of CUDA), whereas packets seem like a higher-level abstraction (the next slide talks about reordering rays). Could this reordering be implemented by assigning work to different threads?

sfackler

Warps are decently exposed in CUDA. A warp is (currently) always 32 threads, and the allocation of threads to warps is well defined. Each thread has a linear thread ID within its block: for 1-dimensional blocks it is simply threadIdx.x, for 2-dimensional blocks it is blockDim.x * threadIdx.y + threadIdx.x, and for 3-dimensional blocks it is blockDim.x * blockDim.y * threadIdx.z + blockDim.x * threadIdx.y + threadIdx.x. Threads with IDs 0-31 run in the same warp, as do 32-63, and so on.
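In code, that mapping looks something like the sketch below (a hypothetical kernel, not from the lecture; it assumes the warp size is 32, as on current NVIDIA hardware):

```
#include <cstdio>

// Compute each thread's linear ID within its block and the warp it belongs to,
// using the layout described above.
__global__ void show_warp_ids()
{
    // Linear thread ID within the block (3D case; the 1D and 2D formulas are
    // the same expression with the unused terms equal to zero).
    int tid = blockDim.x * blockDim.y * threadIdx.z
            + blockDim.x * threadIdx.y
            + threadIdx.x;

    // Threads 0-31 form warp 0, threads 32-63 form warp 1, and so on
    // (assumes a warp size of 32).
    int warp_id = tid / 32;
    int lane    = tid % 32;   // this thread's position within its warp

    if (lane == 0)
        printf("block %d: warp %d starts at thread %d\n",
               blockIdx.x, warp_id, tid);
}

int main()
{
    // Launch one 8x8x2 block: 128 threads = 4 warps.
    show_warp_ids<<<1, dim3(8, 8, 2)>>>();
    cudaDeviceSynchronize();
    return 0;
}
```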