As mentioned in class, one of the benefits of using a keyword such as forall, as opposed to explicitly spawning two pthreads, is that the compiler can decide how many threads to spawn to take the greatest possible advantage of the underlying architecture. For example, if we had a CPU with 4 cores, each with 8 ALUs, using forall might cause the compiler to spawn 4 threads and use 8-wide SIMD AVX instructions. That is 4 x 8 = 32 operations in flight versus 2, so at peak performance this could run 16x faster than the two-thread approach taken in this slide.
Is there an actual way to denote the independence of each iteration other than using a forall keyword though?
Code that hardcodes the number of threads rarely utilizes the CPU properly, because the program can be moved to another CPU with a different number of cores and ALUs, and the benefits are lost.
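To make the point about hardcoded thread counts concrete, here is a minimal sketch of picking the thread count at runtime instead. The helper names pick_num_threads and chunk_bounds are made up for illustration, and _SC_NPROCESSORS_ONLN is a widely supported extension (Linux, macOS, the BSDs) rather than something POSIX guarantees, so treat this as an assumption about the platform.

```c
#include <unistd.h>

// Hypothetical helper: one worker per online logical processor,
// falling back to a single thread if the query fails.
long pick_num_threads(void) {
    long n = sysconf(_SC_NPROCESSORS_ONLN);  // extension, not strict POSIX
    return n > 0 ? n : 1;
}

// Hypothetical helper: split N elements into num_threads contiguous chunks.
// Thread `tid` works on the half-open range [*begin, *end). The first
// (N % num_threads) threads each get one extra element.
void chunk_bounds(int N, int num_threads, int tid, int *begin, int *end) {
    int base = N / num_threads;
    int rem = N % num_threads;
    *begin = tid * base + (tid < rem ? tid : rem);
    *end = *begin + base + (tid < rem ? 1 : 0);
}
```

This only adapts the thread count. It does nothing about SIMD width, which still depends on the compiler vectorizing the loop body or on hand-written intrinsics.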
Shouldn't sinx(args->N,args->terms,args->x,args->result); in my_thread_start use thread_args instead of args?
@zvonryan That's right. Alternatively, the previous line can simply be changed to:
my_args *args = (my_args *)thread_arg;
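For anyone who wants to compile it, here is a sketch of what the corrected thread entry point might look like in context. The my_args layout, the Taylor-series body of sinx, and the parallel_sinx wrapper are reconstructions assumed from the call in this thread, not copied from the slide. The start routine also returns void * here because that is the signature pthread_create expects.

```c
#include <pthread.h>
#include <stddef.h>

// Assumed argument bundle; field names follow the sinx(...) call above.
typedef struct {
    int N;
    int terms;
    float *x;
    float *result;
} my_args;

// Assumed stand-in for the slide's sinx: Taylor series for sin(x[i]).
void sinx(int N, int terms, float *x, float *result) {
    for (int i = 0; i < N; i++) {
        float value = x[i];
        float numer = x[i] * x[i] * x[i];
        float denom = 6.0f;  // 3!
        int sign = -1;
        for (int j = 1; j <= terms; j++) {
            value += sign * numer / denom;
            numer *= x[i] * x[i];
            denom *= (2 * j + 2) * (2 * j + 3);  // next odd factorial
            sign = -sign;
        }
        result[i] = value;
    }
}

// Thread entry point: the local variable is named args, matching its uses.
void *my_thread_start(void *thread_arg) {
    my_args *args = (my_args *)thread_arg;
    sinx(args->N, args->terms, args->x, args->result);
    return NULL;
}

// Two-way split: a spawned thread takes the first half of the array and
// the calling thread takes the rest.
void parallel_sinx(int N, int terms, float *x, float *result) {
    pthread_t thread_id;
    my_args args = { N / 2, terms, x, result };
    int spawned =
        (pthread_create(&thread_id, NULL, my_thread_start, &args) == 0);
    if (!spawned) {
        my_thread_start(&args);  // could not spawn: do the first half here
    }
    sinx(N - args.N, terms, x + args.N, result + args.N);
    if (spawned) {
        pthread_join(thread_id, NULL);
    }
}
```

Depending on the toolchain, this may need to be built with -pthread.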
@krillfish The forall keyword is just something Kayvon made up as an example of what a parallel-oriented language might provide to make it easier for the programmer to implement a parallel algorithm.