chaominy

Question: What does the prefix _mm256 mean?

sjoyner

It means the intrinsic operates on 256-bit registers (the AVX registers). Reference.
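
To make the register width concrete, here is a minimal sketch in C. The helper name add8 is made up for illustration, and it assumes an x86-64 CPU with AVX and a GCC or Clang compiler. A 256-bit register holds eight 32-bit floats, so a single _mm256_add_ps performs eight additions.

```c
#include <immintrin.h>  /* AVX intrinsics: __m256 and the _mm256_* functions */

/* Hypothetical helper: adds eight floats from a and b in one AVX add.
   The target attribute (GCC/Clang) enables AVX for this function only,
   so the file builds without passing -mavx. Assumes the CPU supports AVX. */
__attribute__((target("avx")))
void add8(const float *a, const float *b, float *out)
{
    __m256 va = _mm256_loadu_ps(a);        /* unaligned load of 8 floats */
    __m256 vb = _mm256_loadu_ps(b);        /* unaligned load of 8 floats */
    __m256 vsum = _mm256_add_ps(va, vb);   /* 8 single-precision adds at once */
    _mm256_storeu_ps(out, vsum);           /* unaligned store of 8 floats */
}
```

By the same naming scheme, the _mm prefix alone (for example _mm_add_ps) refers to the 128-bit SSE registers, which hold four floats.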

Mayank

Based on the implementation of ISPC, it seems there is never a need for barrier synchronization or a join within an ISPC function, since all program instances execute the same instructions (with some SIMD lanes masked off on divergence). Is this understanding correct? However, since there can be more program instances than SIMD lanes, how is this ensured? Or do we have to use specialized primitives like reduce_add whenever we want some kind of barrier synchronization?

mmp

ISPC essentially guarantees that the instances within a gang will be re-synchronized after each program statement (more specifically, at "sequence points" in the program, which is a concept that comes from the C standard). See some more discussion here: http://ispc.github.com/ispc.html#data-races-within-a-gang.

Thus, there isn't any need for barrier synchronization or the like between the program instances in a gang. If one is also executing across multiple cores, then ISPC currently only supports launching tasks and waiting for tasks to complete; the tasks aren't supposed to coordinate with each other while executing.

It might be worthwhile for ISPC to support more options for expressing computation across multiple cores, in which case barrier sync or the like would be useful...
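
The lockstep behavior discussed above can be modeled in plain C. This is only a rough sketch of the execution model, not what the ISPC compiler actually emits; GANG_SIZE and gang_abs_or_double are hypothetical names chosen for illustration. Every lane steps through both sides of the branch, a per-lane mask decides which lanes commit results, and all lanes are back together after the statement without any explicit barrier.

```c
#include <stdbool.h>

#define GANG_SIZE 8  /* assumed gang width, e.g. eight 32-bit lanes */

/* Models one gang executing, per program instance:
       if (x < 0) y = -x; else y = 2 * x;
   Each loop below stands in for one SIMD step applied to all lanes. */
void gang_abs_or_double(const int x[GANG_SIZE], int y[GANG_SIZE])
{
    bool mask[GANG_SIZE];

    /* Step 1: every lane evaluates the condition. */
    for (int lane = 0; lane < GANG_SIZE; lane++)
        mask[lane] = x[lane] < 0;

    /* Step 2 ("then" side): only lanes with the mask set commit a result. */
    for (int lane = 0; lane < GANG_SIZE; lane++)
        if (mask[lane])
            y[lane] = -x[lane];

    /* Step 3 ("else" side): the mask is inverted for the remaining lanes. */
    for (int lane = 0; lane < GANG_SIZE; lane++)
        if (!mask[lane])
            y[lane] = 2 * x[lane];

    /* All lanes have reconverged here; no barrier was needed because
       no lane could run ahead of the others. */
}
```

Because the whole gang advances one step at a time, a lane that skips the "then" side simply idles through it, which is why divergence costs time but does not require synchronization.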