Slide 66 of 79

BLAS Level 1 operations (vector-vector routines such as axpy and dot) are mostly of this kind. I tried several latency-hiding techniques and none of them helped.
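To put a rough number on it, here is a minimal sketch, assuming "this kind" means bandwidth-bound and using axpy (y = a*x + y) as the example. The function names and the machine figures (100 GFLOP/s peak, 25 GB/s memory bandwidth) are hypothetical, chosen only for illustration. It uses a simple roofline-style estimate, not a measurement.

```python
def axpy_arithmetic_intensity(bytes_per_element: int = 4) -> float:
    """FLOPs per byte of memory traffic for one element of y = a*x + y."""
    flops = 2                          # one multiply, one add
    traffic = 3 * bytes_per_element    # load x[i], load y[i], store y[i]
    return flops / traffic


def attainable_gflops(intensity: float, peak_gflops: float, bandwidth_gbs: float) -> float:
    """Roofline-style bound: limited by compute peak or by memory bandwidth."""
    return min(peak_gflops, intensity * bandwidth_gbs)


if __name__ == "__main__":
    intensity = axpy_arithmetic_intensity(4)           # single precision
    # Hypothetical machine: 100 GFLOP/s peak, 25 GB/s memory bandwidth.
    bound = attainable_gflops(intensity, 100.0, 25.0)
    print(f"intensity = {intensity:.3f} FLOP/byte, bound = {bound:.2f} GFLOP/s")
```

With so little arithmetic per byte, the bound sits far below the compute peak under these assumed numbers. If the memory bus is already saturated, hiding latency leaves nothing more to gain, which may be why those techniques did not help.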


Are current profilers effective at profiling bandwidth versus computation and at giving optimization hints? VTune seems to be a powerful tool. See "How to do a system wide analysis using Intel® VTune™ Amplifier 2014 for Systems" on the Intel® Developer Zone.