Slide 11 of 63
bpr

These results were collected on a shared-memory system. Would moving to a distributed-memory system change the results?

nba16235

Could someone explain why the 4D blocked data layout can reduce the waiting time due to barriers?

EggyLv999

@nba Barrier wait time comes from some threads taking longer than others: every thread has to wait at the barrier until the slowest one arrives. By reducing the mean time to get data from the cache, we've also reduced its variance, which means the slowest thread won't take as long on average to complete, so the other threads spend less time idle.
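
To make the barrier argument concrete, here is a minimal sketch of the arithmetic. The per-thread times are made-up illustrative numbers (not measurements from the slide), and `barrier_wait_times` is a hypothetical helper name. Each thread's wait is the slowest thread's finish time minus its own, so narrowing the spread between threads shrinks the total idle time.

```python
def barrier_wait_times(finish_times):
    """Idle time per thread at a barrier: slowest finish time minus own finish time."""
    slowest = max(finish_times)
    return [slowest - t for t in finish_times]


# Hypothetical per-thread phase times in arbitrary units (assumed, not measured).
high_variance = [10, 12, 11, 19]  # one thread is much slower than the rest
low_variance = [9, 10, 9, 10]     # lower mean and a tighter spread

for label, times in (("high variance", high_variance), ("low variance", low_variance)):
    waits = barrier_wait_times(times)
    print(label, "waits:", waits, "total idle:", sum(waits))
```

In the first case the three faster threads sit idle while the slow one catches up; in the second case almost nobody waits.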