Slide 27 of 79
ametoki

As explained in class, AVX performs "vectorized" arithmetic on special wide registers (256 bits, for instance), so a single instruction effectively handles several ordinary calculations at once (e.g. eight 32-bit floats or four 64-bit doubles).
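To make sure I have the programmer-visible side right, here is a minimal sketch in C of what I understand "one instruction, eight float adds" to mean. The function name `add8_floats` is my own, hypothetical helper, not something from the lecture. When the compiler targets AVX (e.g. built with `-mavx`, so `__AVX__` is defined), it uses the `_mm256_add_ps` intrinsic from `<immintrin.h>`; otherwise it falls back to the equivalent scalar loop. It only shows the semantics and says nothing about how the hardware actually executes the add.

```c
#include <stddef.h>
#if defined(__AVX__)
#include <immintrin.h>
#endif

/* Hypothetical helper (not from the lecture):
 * computes out[i] = a[i] + b[i] for exactly 8 floats. */
void add8_floats(const float *a, const float *b, float *out)
{
#if defined(__AVX__)
    /* Load 8 floats (256 bits) from each input; unaligned loads are allowed. */
    __m256 va = _mm256_loadu_ps(a);
    __m256 vb = _mm256_loadu_ps(b);

    /* One vector add produces all 8 sums. */
    __m256 vsum = _mm256_add_ps(va, vb);

    /* Store the 8 results back to memory. */
    _mm256_storeu_ps(out, vsum);
#else
    /* Scalar fallback with the same result: 8 separate float adds. */
    for (size_t i = 0; i < 8; ++i) {
        out[i] = a[i] + b[i];
    }
#endif
}
```

Either path should give the same eight sums; the difference is that the AVX path expresses them as a single add on 256-bit values rather than eight separate adds.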

My question is: how is an AVX instruction carried out in hardware? For example, in a 256-bit add of 8 floats, do 8 ALUs (of our imaginary CPU) all carry out their calculations simultaneously?