Slide 48 of 62
sbly

What exactly is this dynamic scheduling policy? And what does it even mean to "respect resource requirements"? If all cores are identical, why would it matter which block we assign to which core?

vrkrishn

If you recall, in ISPC we identified multiple data-parallel tasks that the OS could schedule onto execution units at its leisure. However, the OS had no knowledge before execution of how long each task would take (recall that some rows took longer to compute than others). If the OS assigned an even number of tasks to each processor before execution (static assignment), the different execution units might end up taking vastly different times to complete their tasks.

Therefore, the OS makes a queue of the different tasks. It initially distributes one task to each execution unit and only when that unit is done will another task be assigned to it. This is dynamic assignment because the OS is determining which processor individual tasks are run on during execution of the program.
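To make the queue idea concrete, here is a minimal simulation sketch, not how ISPC or CUDA actually implement it. The function names and the task costs are made up for illustration; the costs stand in for rows that take different amounts of time. It compares the finish time of a fixed, even split against handing out one task at a time to whichever unit frees up first.

```python
import heapq

def static_makespan(costs, num_workers):
    """Finish time when tasks are split into equal-size contiguous chunks up front."""
    chunk = (len(costs) + num_workers - 1) // num_workers
    return max(sum(costs[i:i + chunk]) for i in range(0, len(costs), chunk))

def dynamic_makespan(costs, num_workers):
    """Finish time when the next queued task goes to whichever worker is free first."""
    free_at = [0] * num_workers      # min-heap of the times at which workers become free
    heapq.heapify(free_at)
    for cost in costs:               # tasks leave the queue in order
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + cost)
    return max(free_at)

# Hypothetical costs: four cheap tasks followed by four expensive ones.
costs = [1, 1, 1, 1, 8, 8, 8, 8]
print(static_makespan(costs, 2))   # one worker gets all the expensive tasks
print(dynamic_makespan(costs, 2))  # expensive tasks end up spread across both workers
```

With these made-up costs the static split leaves one worker idle for most of the run, while the queue keeps both busy until the end.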

For example, imagine there are two processors: one can process 1 task per unit time and the other 3 tasks per unit time. With n tasks in total, a static assignment that divides the tasks in half would take n/2 units of time to finish (the slow processor needs n/2 units for its half, while the fast one finishes its half in n/6 and then sits idle). In dynamic assignment, the fast processor would request 3 tasks per unit time while the slow processor would ask for only 1. Thus 4 tasks are completed per unit time and the execution time is n/4. A 3:1 distribution of load is not obvious before execution, but observing how the processors actually perform while running lets the OS adapt to it.
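The arithmetic in this example can be checked with a small simulation. This is only a sketch under the assumptions of the example (speeds of 1 and 3 tasks per unit time, all tasks the same size); the function names and n = 12 are chosen for illustration.

```python
from fractions import Fraction
import heapq

def static_even_split_time(n, speeds):
    """Each processor gets n / len(speeds) tasks; the slowest share sets the finish time."""
    share = Fraction(n, len(speeds))
    return max(share / s for s in speeds)

def dynamic_time(n, speeds):
    """Each processor takes one task from the queue whenever it becomes free."""
    heap = [(Fraction(0), w) for w in range(len(speeds))]  # (time free, processor id)
    heapq.heapify(heap)
    done = [0] * len(speeds)         # tasks completed per processor
    finish = Fraction(0)
    for _ in range(n):
        t, w = heapq.heappop(heap)
        t_end = t + Fraction(1, speeds[w])   # one task takes 1/speed units of time
        done[w] += 1
        finish = max(finish, t_end)
        heapq.heappush(heap, (t_end, w))
    return finish, done

speeds = [1, 3]                      # slow processor, fast processor
print(static_even_split_time(12, speeds))   # n/2 = 6
print(dynamic_time(12, speeds))             # finish time n/4 = 3, with a 3:9 task split
```

Nobody told the queue about the 3:1 ratio; it falls out of the fast processor simply coming back for work three times as often.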

vrkrishn

For the second part: if I recall correctly, the resource requirement for blocks is that all threads in a block must run concurrently. Therefore, as much as a dynamic scheduler might want to split a block up, CUDA's rules about blocks prevent that assignment.
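A toy sketch of what that constraint means for the scheduler, assuming the only resource is thread slots per core (real hardware tracks more than that). The function name, block sizes, and slot counts are all hypothetical.

```python
def assign_blocks(block_sizes, free_slots):
    """Place each block whole on the first core with enough free thread slots.

    A block is never split across cores; a block that fits nowhere gets None
    and has to wait until some core frees up enough slots.
    """
    free = list(free_slots)
    placement = []
    for threads in block_sizes:
        core = next((c for c, f in enumerate(free) if f >= threads), None)
        if core is not None:
            free[core] -= threads
        placement.append(core)
    return placement, free

# Hypothetical setup: two cores with 192 thread slots each, three 128-thread blocks.
placement, free = assign_blocks([128, 128, 128], [192, 192])
print(placement)  # [0, 1, None]
print(free)       # [64, 64]
```

The third block has to wait even though 128 slots are free in total, because they are spread 64 and 64 across two cores and the block cannot be divided between them.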