Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2015

Slide 31 of 69

afzhang

For anyone else trying to understand the handwritten C + AVX intrinsics implementation, I found this reference guide very useful!

ChandlerBing

I found this to be pretty useful too.

funkysenior15

I can't seem to find a reference for the reduce_add() function. Internally, does it try to parallelize the addition, or does it do a single pass like the C + AVX implementation on the left?

kayvonf

@funkysenior15: https://ispc.github.io/ispc.html#reductions
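To make the two options in the question concrete, here is a rough sketch in plain C of a single-pass sum over the lanes next to a tree-style sum. This is only an illustration, not how ISPC implements reduce_add(); the linked docs are the place to check that. The names reduce_add_serial, reduce_add_tree, and LANES are made up for this example, and the 8-lane width is an assumption (8 floats in a 256-bit AVX register).

```c
#include <stddef.h>

#define LANES 8  /* assumption: 8 floats per vector, as in 256-bit AVX */

/* Single pass: one dependent addition per lane, in order. */
float reduce_add_serial(const float lanes[LANES]) {
    float sum = 0.0f;
    for (size_t i = 0; i < LANES; i++) {
        sum += lanes[i];
    }
    return sum;
}

/* Tree reduction: log2(8) = 3 steps. The additions inside each step
   are independent of each other, so each step could in principle be
   done with one vector add on real SIMD hardware. */
float reduce_add_tree(const float lanes[LANES]) {
    float tmp[LANES];
    for (size_t i = 0; i < LANES; i++) {
        tmp[i] = lanes[i];  /* work on a copy so the input is untouched */
    }
    for (size_t width = LANES / 2; width >= 1; width /= 2) {
        for (size_t i = 0; i < width; i++) {
            tmp[i] += tmp[i + width];  /* fold the upper half onto the lower */
        }
    }
    return tmp[0];
}
```

Both functions return the same total for inputs where float rounding does not matter. For general floats the results can differ slightly, because the additions happen in a different order.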

lament

In the C + AVX code, are we assuming 8-wide SIMD?