unihorn

In my view, a discussion of workload balance on SIMD is also necessary. Because the SIMD lanes execute instructions in lockstep, the running time is determined by the slowest lane, while the others have to wait for it. So it is better to schedule work items of similar length together when using SIMD. However, this optimization may add cost in preprocessing or memory loading.
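
To make the lockstep cost concrete, here is a minimal sketch under a simplified model: each group of `width` work items occupies the SIMD unit for as long as its longest item. The function name `simd_cost` and the sample workload are made up for illustration, and the sort stands in for the preprocessing cost mentioned above.

```python
def simd_cost(lengths, width):
    """Total lockstep steps under a toy model: every group of `width`
    items costs as much as the longest item in that group."""
    return sum(max(lengths[i:i + width]) for i in range(0, len(lengths), width))

# Hypothetical workload: short and long items interleaved.
work = [1, 8, 1, 8, 1, 8, 1, 8]

# Scheduled as-is: each group of 4 contains a long item.
unsorted_cost = simd_cost(work, 4)

# Grouping similar-length items together (the sort is the extra preprocessing).
sorted_cost = simd_cost(sorted(work), 4)

print(unsorted_cost, sorted_cost)
```

In this toy case the interleaved schedule costs 16 steps and the sorted one costs 9, because the short items no longer wait on a long neighbor.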

kailuo

Your discussion also reminds me of branch prediction. Remember that if the code running on a SIMD unit contains an if...else statement and the lanes disagree on the condition, all lanes step through both branches, and the results for the lanes that did not take a given branch are discarded. Although branch prediction is an important and widely studied area, it is still not perfect. Hence it might also help if the programmer keeps Amdahl's law in mind while coding and makes the program easier to split into balanced parts (e.g., by reducing the number of branches so SIMD processing can be more efficient).
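
As a rough illustration of the if...else behavior described above, the following sketch emulates masked execution in plain Python. It is only a model, not real SIMD code: `simd_if_else` and its arguments are hypothetical names, and each list element stands for one lane.

```python
def simd_if_else(xs, cond, then_fn, else_fn):
    """Emulate a divergent if...else across lanes (one list element per lane)."""
    mask = [cond(x) for x in xs]          # per-lane condition
    then_vals = [then_fn(x) for x in xs]  # 'then' branch evaluated for every lane
    else_vals = [else_fn(x) for x in xs]  # 'else' branch evaluated for every lane
    # Keep the result from the branch each lane actually took; discard the other.
    return [t if m else e for m, t, e in zip(mask, then_vals, else_vals)]

# Lanes 0 and 2 take the 'then' branch, lanes 1 and 3 take the 'else' branch.
result = simd_if_else([1, -2, 3, -4], lambda x: x > 0, lambda x: x * 2, lambda x: -x)
print(result)
```

Both branch bodies are evaluated for all four lanes even though each lane keeps only one result, which is why fewer divergent branches means less wasted work.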