To fully utilize these resources, do we need to create 16 threads that each operate on a different part of the data, with each thread running the vector program from the previous slide?
@BigFish Yes. Each of the 16 threads can run on its own core, and each thread can use that core's 8-wide vector unit.
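To make that concrete, here is a rough sketch of the idea in C++. It is not the vector program from the slide: the kernel (`saxpy_chunk`), the wrapper (`saxpy_parallel`), and the constants `kThreads` and `kLanes` are names I made up for illustration, and it assumes the machine has 16 cores with 8-wide vector units as discussed above. The outer level splits the data across 16 threads; the inner loop works on groups of 8 independent elements, which a vectorizing compiler can map to 8-wide vector instructions (whether it actually does depends on the compiler and flags).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Assumed machine shape from the discussion: 16 cores, 8-wide vector units.
constexpr std::size_t kThreads = 16;
constexpr std::size_t kLanes = 8;

// Hypothetical kernel: y[i] = a * x[i] + y[i] for i in [begin, end).
void saxpy_chunk(float a, const float* x, float* y,
                 std::size_t begin, std::size_t end) {
    std::size_t i = begin;
    // Main loop: each group of kLanes elements is independent, so a
    // vectorizing compiler may turn the inner loop into vector instructions.
    for (; i + kLanes <= end; i += kLanes) {
        for (std::size_t lane = 0; lane < kLanes; ++lane) {
            y[i + lane] = a * x[i + lane] + y[i + lane];
        }
    }
    // Scalar tail for the leftover elements (fewer than kLanes).
    for (; i < end; ++i) {
        y[i] = a * x[i] + y[i];
    }
}

// Splits the arrays into kThreads contiguous chunks, one per thread.
void saxpy_parallel(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const std::size_t chunk = (n + kThreads - 1) / kThreads;  // ceiling division

    std::vector<std::thread> workers;
    workers.reserve(kThreads);
    for (std::size_t t = 0; t < kThreads; ++t) {
        const std::size_t begin = std::min(n, t * chunk);
        const std::size_t end = std::min(n, begin + chunk);
        workers.emplace_back(saxpy_chunk, a, x.data(), y.data(), begin, end);
    }
    for (auto& w : workers) {
        w.join();
    }
}
```

Each thread touches a disjoint range of `y`, so no synchronization is needed beyond the final `join`. With both levels in use, up to 16 × 8 = 128 elements can be in flight per vector instruction step, versus 1 for a single scalar thread.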