When a gang of program instances is spawned, each instance of the function gets access to two variables:
programCount: gives the number of instances that have been spawned for this function.
programIndex: gives the index of the current instance within the gang (0 to programCount - 1).
In the example sin(x) program, these variables are used in the loop to make each instance calculate sin(x[i]) for every "programCount"th (i.e. fourth) value requested, starting from the "programIndex"th index.
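To make the interleaving concrete, here is a minimal sketch in plain C (not ISPC) that emulates the gang serially and records which instance would handle each array element. The names record_owners and PROGRAM_COUNT are made up for illustration, and the gang size of 4 is just the value assumed above; in real ISPC the inner loop does not exist, because all instances execute together.

```c
#include <stddef.h>

/* Assumed gang size; in ISPC this would be the built-in programCount. */
enum { PROGRAM_COUNT = 4 };

/* Serial emulation of the gang. The outer loop plays the role of the shared
   loop counter i; the inner loop stands in for the instances that really
   run together, each with its own programIndex. */
void record_owners(size_t n, int *owner) {
    for (size_t i = 0; i < n; i += PROGRAM_COUNT) {
        for (int program_index = 0; program_index < PROGRAM_COUNT; ++program_index) {
            size_t idx = i + (size_t)program_index;  /* element for this instance */
            if (idx < n) {                           /* guard when n is not a multiple of the gang size */
                owner[idx] = program_index;
            }
        }
    }
}
```

With n = 10, instance 0 ends up with elements 0, 4 and 8, instance 1 with elements 1, 5 and 9, and so on.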
hweetvpu
From the ISPC documentation (which is really nice): the set of program instances is mapped to the SIMD lanes of the current processor, leading to excellent utilization of hardware SIMD units and high performance.
programCount is usually no more than 2-4x the SIMD width, and ISPC converts each gang-wide operation into instructions that fit the hardware's SIMD width (e.g. with 8 instances on 4-wide SSE, each operation takes two SSE instructions).
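As a rough model of that arithmetic (my own simplification, not something taken from the ISPC documentation), the number of hardware vector instructions per gang-wide operation is just a ceiling division. The function name vector_ops_per_gang_op is hypothetical.

```c
/* Rough model: how many hardware vector instructions are needed to cover
   one gang-wide operation. Assumes every element fits one SIMD lane. */
unsigned vector_ops_per_gang_op(unsigned program_count, unsigned simd_width) {
    return (program_count + simd_width - 1) / simd_width;  /* ceiling division */
}
```

For 8 instances on 4-wide SSE this gives 2, and for 4 instances on 4-wide SSE it gives 1.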
-o4
The performance benefit of always using uniform when expressing such semantics can be found here.
muchanon
What is the effect of using the uniform modifier unnecessarily? For example, if the float value were specified as uniform, would it cause any issues?
jkorn
uniform indicates a variable that is exactly the same across all program instances. In your case, if we had uniform value instead of float value, I'm not sure if it would be a compile-time error or if it would still run and just be incorrect (perhaps share values across program instances that shouldn't be shared), but the value of value is dependent on what programIndex is, which is a program instance-specific value. So for different program instances, value would take on different values, which by definition contradicts what uniform represents.
ZoSo
I think declaring the float value as uniform will generate a compile-time error in this case. This is from the Intel documentation: "It's legal to add two uniform variables together and assign the result to a uniform variable, but assigning a non-uniform (i.e., varying) value to a uniform variable is a compile-time error."
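One way to picture why the assignment cannot work is the plain C sketch below (not ISPC). It models a varying float as one value per instance and checks whether those values could live in a single shared slot. The names varying_float, fits_in_uniform, load_per_instance and GANG_SIZE are all hypothetical, and the gang size of 4 is an assumption. Note that the real compiler rejects the assignment based on the types alone, not on the runtime values.

```c
#include <stdbool.h>

/* Assumed gang size; stands in for ISPC's programCount. */
enum { GANG_SIZE = 4 };

/* A varying float modeled as one value per program instance. */
typedef struct { float lane[GANG_SIZE]; } varying_float;

/* True only when every instance holds the same value, i.e. the value could
   be stored in a single uniform slot without losing information. */
bool fits_in_uniform(const varying_float *v) {
    for (int k = 1; k < GANG_SIZE; ++k) {
        if (v->lane[k] != v->lane[0]) {
            return false;
        }
    }
    return true;
}

/* Models a load indexed by i + programIndex for one loop iteration:
   each instance reads a different element of x. */
varying_float load_per_instance(const float *x, int i) {
    varying_float v;
    for (int program_index = 0; program_index < GANG_SIZE; ++program_index) {
        v.lane[program_index] = x[i + program_index];
    }
    return v;
}
```

Since each instance loads a different element, the per-instance values generally differ, which is exactly what a uniform variable is not allowed to hold.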
shhhh
Could someone explain to me how this piece of code works? I'm confused on certain parts of it. i is a uniform value, so it's the same over all the threads, no? When i is incremented by programCount this new value should be shared across all the threads, so how do all the threads do different work? Also, what is the programIndex?
shhhh
Never mind. I had the purposes of programIndex and programCount switched in my head, so the code didn't make sense. I've got it now.
mak
If I remember correctly, it was mentioned in class that programCount (i.e. the number of instances in the gang) is specified by the user at compile time. Wouldn't it be better if it were controlled by ISPC itself, since it has all the necessary information about the hardware architecture to make the most efficient assignment decision?