Static assignment is like the programmer handing each worker a fixed set of tasks up front. Dynamic assignment is like the workers claiming tasks themselves whenever they finish the task at hand. Dynamic assignment tends to be faster because the workers stay busy until all the work is done; with static assignment, some workers may be faster, finish early, and then sit idle waiting for the others.
I was wondering whether dynamic assignment is always superior to static assignment. If not, in what situations should we resort to static assignment instead of dynamic assignment?
@haboric I think that if we know the optimal way to assign tasks, we can achieve a notable speedup over dynamically assigning tasks to whichever worker thread happens to be free at the moment. I'm guessing this is an application of the workload-distribution problems from algorithms.
@haboric For example, suppose we as programmers know for sure that we have one task that will run for, say, 3 minutes, 2 tasks that will run for 2 minutes, and one task that will run for 1 minute. Dynamically assigning tasks to 2 cores could cause the program to run for 5 minutes (3+2 on one core, 2+1 on the other). However, we can statically assign the tasks so that the program runs for 4 minutes (3+1 on one core, 2+2 on the other). Dynamically assigning tasks from an unordered list is 2-competitive, and the best known dynamic/online algorithm so far only achieves a 1.9201 competitive ratio (https://en.wikipedia.org/wiki/Job_shop_scheduling).
I know it was discussed in class that tasks might be preferable to pthreads. But what exactly in their implementations makes this so? This was touched on briefly in class, but I don't think we went far enough into the specifics of the implementations that lead to such a difference in efficiency.
@chuangxuean I think this is at least part of it: when you launch a bunch of threads, they all run concurrently and compete for resources. When you launch a bunch of tasks, they aren't necessarily all running concurrently. The scheduler is free to schedule them efficiently, for example by running only as many tasks concurrently as there are cores.
What synchronization primitives does ISPC use for task assignment, given that the task list is a shared resource? And is there ever a case where synchronization on the task list becomes a bottleneck in a parallel program?
@bojianh I would speculate that ISPC uses standard synchronization primitives (a queue guarded by a mutex, or a lock-free queue). As you mentioned, synchronization can definitely add overhead to the computation, and if we have, say, $N = 500000$ and create $500000$ tasks for the program, then I would guess that the overhead is huge.
In practice, I think it's useful to benchmark the program's performance with different numbers of tasks (as in Assignment One).