In lecture, we noted that if we rewrote this code to let the compiler determine how many threads of execution to spawn (based on the number of cores on the machine), we would better utilize the machine's resources.
Are there any cases in which it is advantageous to use pthreads instead?
@Cake What do you mean by "allow the compiler to determine how many threads of execution should be spawned"? Are you referring to hardware threads?
@xiaozhuyfk Isn't the number of threads limited by the number of cores available? A thread is an abstraction provided by the OS, so I don't understand what you mean here by "hardware thread".
@Cake Do you mean using some kind of predefined macros, so that the compiler can use them to create the appropriate number of threads?
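One possible reading of "letting the system pick the thread count" is that the program asks at run time how many logical CPUs the machine has and sizes its thread pool from that. The sketch below uses Python rather than pthreads purely for brevity, and the helper name default_worker_count is hypothetical. It also bears on the "hardware thread" question: os.cpu_count() reports logical CPUs, which can exceed the number of physical cores when each core supports more than one hardware thread.

```python
import os


def default_worker_count():
    """Return the number of logical CPUs, falling back to 1 if unknown.

    Hypothetical helper for illustration only.
    """
    # os.cpu_count() counts logical CPUs (hardware threads), not physical
    # cores, and may return None when the count cannot be determined.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("logical CPUs on this machine:", default_worker_count())
```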
In general, how do we reason about how many pthreads to spawn on a system with n cores? Assuming that we can create an arbitrary number of independent threads, is spawning n threads ideal, or should we spawn more?
@boba I think in this case, if you have n cores and 1 thread for each core, then spawning n threads is ideal, because the extra threads cannot run in parallel and may cause more cache misses.
@boba I think there are cases when it is better to spawn more than n threads: namely, when the cost of thread management/communication overhead is less than the cost of computing an optimal distribution of work across the threads.
To add to this discussion, I think the number of threads spawned should also depend on what kind of program we are executing. Suppose we have a machine with n cores. If a program is going to make a lot of I/O requests, then restricting yourself to n threads might result in worse performance than having more than n threads. With more than n threads, we can issue more requests while the other threads wait for their requests to return. With just n threads, each thread issues its request and must wait, doing no useful work, before issuing the next one. The flip side is a program that is CPU intensive. In this case we want to eliminate any context-switching overhead associated with having more than n threads, so it might be better to spawn n threads.
A program running on a single-core CPU may benefit from multithreading if it is going to perform many I/O operations. The program can switch to another thread while the current thread is waiting for I/O.
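The I/O-bound case can be illustrated with a small experiment. This is only a sketch under stated assumptions: Python threads stand in for pthreads, time.sleep stands in for a blocking I/O request, and the names fake_io_request and run_requests are made up for the example. Because a sleeping thread does not occupy a core, the run with more threads should finish sooner regardless of how many cores the machine has.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io_request(delay_s):
    """Stand-in for a blocking I/O call: the calling thread just waits."""
    time.sleep(delay_s)
    return delay_s


def run_requests(num_threads, num_requests=16, delay_s=0.05):
    """Issue simulated I/O requests on num_threads workers.

    Returns the elapsed wall-clock time in seconds.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # Each worker blocks in fake_io_request; extra workers let more
        # requests be outstanding at the same time.
        list(pool.map(fake_io_request, [delay_s] * num_requests))
    return time.perf_counter() - start


if __name__ == "__main__":
    few = run_requests(num_threads=2)
    many = run_requests(num_threads=16)
    print(f"2 threads: {few:.2f}s, 16 threads: {many:.2f}s")
```

This says nothing about the CPU-intensive case, where the extra threads would compete for cores instead of waiting.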
I think I'd like to bring up Cake's question again because it doesn't seem like anyone has answered it. Is there any scenario where it would be better to manually spawn threads vs letting the compiler determine how many threads to create?
Presuming the compiler creates the maximum number of threads based on the hardware capabilities, conceptually I think it will be better to manually spawn fewer threads than the compiler would generate if the computation time of each thread is much less than the synchronization time required between threads.
@anindyag "if the computation time of each thread is much less than the synchronization time required between threads": by synchronizing threads, you mean the threads your application generates, right? And if so, how exactly is it better to spawn fewer threads than the number of compiler threads?
As a summary of the points discussed above, in order to decide how many threads we need, we definitely have to consider the nature of the program. If the program makes many I/O requests, we can interleave threads to hide the I/O latency. However, we have to bear in mind the overhead caused by context switching, communication, and synchronization. If the cost outweighs the benefit, we might want to reduce the number of threads we create.
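The trade-off in this summary is sometimes captured by a rule of thumb that scales the thread count by the ratio of waiting time to computing time. The sketch below is an assumption brought in for illustration, not something stated in lecture, and the function name suggested_thread_count is hypothetical. It ignores context-switch, communication, and synchronization costs, so it is at best a starting point to be checked by measurement.

```python
def suggested_thread_count(n_cores, wait_time, compute_time):
    """Rule-of-thumb thread count: n_cores * (1 + wait_time / compute_time).

    wait_time and compute_time are per-task averages in the same unit.
    Overheads from context switching and synchronization are NOT modeled.
    """
    if n_cores < 1:
        raise ValueError("n_cores must be at least 1")
    if compute_time <= 0:
        raise ValueError("compute_time must be positive")
    return max(1, round(n_cores * (1 + wait_time / compute_time)))


if __name__ == "__main__":
    # Purely CPU-bound work: one thread per core.
    print(suggested_thread_count(4, wait_time=0.0, compute_time=1.0))  # -> 4
    # Tasks that wait 3x as long as they compute: 4 threads per core.
    print(suggested_thread_count(4, wait_time=3.0, compute_time=1.0))  # -> 16
```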