Slide 22 of 54
unihorn

In my view, a CUDA thread is much like a program instance in ISPC. The threads of a warp execute in SIMD lockstep, like the instances of an ISPC gang; this is an implementation detail that exists to keep the ALUs fully utilized. A thread block, whose size is chosen by the programmer, is made up of warps. What makes the thread block an important concept is that threads in the same block can synchronize cheaply and share data through shared memory. In the implementation, all threads of a block are assigned to a single core, and the block stays on that core until it finishes; because blocks are independent of one another, it never needs to be moved or to wait on another block. The programmer decomposes a problem into a set of thread blocks, and in the implementation a hardware scheduler distributes those blocks across cores. One core may hold several thread blocks at once to hide latency. Blocks that have not yet been scheduled are assigned to a core by the hardware later, as resources free up, which reminds me of tasks in ISPC.
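To make the hierarchy above concrete, here is a minimal sketch of a per-block sum. It is only an illustration: the kernel name `block_sum`, the host helper `gpu_sum_of_iota`, and the block size of 128 threads are my own assumptions, not something taken from the slide. Each block cooperates through `__shared__` memory and `__syncthreads()`, and no block ever waits on another, which is what lets the hardware hand blocks to cores in any order.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Assumed block size: 128 threads, i.e. 4 warps of 32 threads.
// The tree reduction below requires a power of two.
constexpr int THREADS_PER_BLOCK = 128;

// Each block sums its own slice of `in` and writes one partial sum to `out`.
__global__ void block_sum(const int* in, int* out, int n) {
    // Shared memory is visible only to the threads of this block.
    __shared__ int scratch[THREADS_PER_BLOCK];

    int tid = threadIdx.x;                    // index within the block
    int gid = blockIdx.x * blockDim.x + tid;  // index within the whole grid

    scratch[tid] = (gid < n) ? in[gid] : 0;
    __syncthreads();  // barrier across all warps of this block only

    // Tree reduction in shared memory; the barrier sits outside the
    // branch so that every thread of the block reaches it.
    for (int stride = blockDim.x / 2; stride > 0; stride /= 2) {
        if (tid < stride) scratch[tid] += scratch[tid + stride];
        __syncthreads();
    }

    if (tid == 0) out[blockIdx.x] = scratch[0];
}

// Hypothetical host helper: sums 0..n-1 on the GPU, one partial sum per block.
long long gpu_sum_of_iota(int n) {
    std::vector<int> h_in(n);
    for (int i = 0; i < n; ++i) h_in[i] = i;

    int num_blocks = (n + THREADS_PER_BLOCK - 1) / THREADS_PER_BLOCK;
    std::vector<int> h_out(num_blocks);

    int* d_in = nullptr;
    int* d_out = nullptr;
    cudaMalloc(&d_in, n * sizeof(int));
    cudaMalloc(&d_out, num_blocks * sizeof(int));
    cudaMemcpy(d_in, h_in.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    // Usually far more blocks than cores: the hardware assigns them to
    // cores as resources become available.
    block_sum<<<num_blocks, THREADS_PER_BLOCK>>>(d_in, d_out, n);

    // This copy waits for the kernel to finish before reading the results.
    cudaMemcpy(h_out.data(), d_out, num_blocks * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);

    long long total = 0;
    for (int v : h_out) total += v;  // combine the per-block partial sums on the host
    return total;
}

int main() {
    printf("%lld\n", gpu_sum_of_iota(1 << 16));
    return 0;
}
```

With `n = 1 << 16` the launch creates 512 blocks, so on most GPUs many of them start out unscheduled. The code never mentions warps: the split of each block into warps is done by the hardware, just as the gang width is an implementation matter in ISPC.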