Are we able to abstract away the NUM_PROCESSORS part from allocate in the same way that for_all and foreach do?
This comment was marked helpful 0 times.
arjunh
@idl: I guess you could. The idea behind the call allocate(n+2, n+2, BLOCK_Y, NUM_PROCESSORS) is to specify an assignment strategy, i.e. how the array elements are distributed across the processors. The strategy suggested as the better option in lecture was a blocked assignment, since it involves less communication among processors: only the rows on block boundaries have to be sent to other processors, because each processor already holds most of the data it needs to perform the algorithm.
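To make the communication argument concrete, here is a rough Python sketch (not the lecture's code). It assumes that updating a row needs the rows directly above and below it, and it counts how often that neighboring row lives on a different processor. The names blocked_owner, interleaved_owner and rows_to_communicate are made up for this illustration.

```python
def blocked_owner(row, num_rows, num_procs):
    """Owner of `row` when rows are split into contiguous blocks."""
    rows_per_proc = -(-num_rows // num_procs)  # ceiling division
    return row // rows_per_proc


def interleaved_owner(row, num_rows, num_procs):
    """Owner of `row` when rows are dealt out round-robin."""
    return row % num_procs


def rows_to_communicate(owner, num_rows, num_procs):
    """Count (row, neighbor) pairs where the neighbor row is on another processor.

    Assumption: each row update reads the rows directly above and below it.
    """
    count = 0
    for row in range(num_rows):
        for neighbor in (row - 1, row + 1):
            if not 0 <= neighbor < num_rows:
                continue  # no neighbor beyond the edge of the grid
            if owner(neighbor, num_rows, num_procs) != owner(row, num_rows, num_procs):
                count += 1
    return count


blocked = rows_to_communicate(blocked_owner, 12, 4)          # 6
interleaved = rows_to_communicate(interleaved_owner, 12, 4)  # 22
```

With 12 rows on 4 processors, the blocked layout only crosses processors at the 3 block boundaries, while the interleaved layout crosses at every row.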
This is quite similar to explicitly writing a for loop with programCount and programIndex to specify which pieces of work each program instance should perform. We could instead leave those details out and use a foreach construct, letting the ISPC compiler decide how to assign the iterations to program instances. I'm not sure, but in theory the compiler should be able to figure out how to assign the array blocks to the processors as well.
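As a rough model of that difference (plain Python, not ISPC, and not a claim about what the ISPC compiler actually does), the sketch below contrasts spelling out the assignment by hand with only stating the iteration range and letting a policy pick the mapping. manual_assignment, foreach_assignment and the two policy functions are hypothetical names for this illustration.

```python
def manual_assignment(n, program_count):
    """Explicit loop: instance p takes i = p, p + program_count, ..."""
    work = {}
    for program_index in range(program_count):
        work[program_index] = list(range(program_index, n, program_count))
    return work


def foreach_assignment(n, program_count, policy):
    """foreach-style: the caller only states the range 0..n-1.

    `policy` stands in for whatever mapping the system chooses.
    """
    work = {p: [] for p in range(program_count)}
    for i in range(n):
        work[policy(i, n, program_count)].append(i)
    return work


def interleaved_policy(i, n, program_count):
    return i % program_count


def blocked_policy(i, n, program_count):
    per_instance = -(-n // program_count)  # ceiling division
    return i // per_instance


by_hand = manual_assignment(8, 4)
chosen_for_us = foreach_assignment(8, 4, blocked_policy)
```

Either way every iteration is done exactly once; the only thing that changes is who decides the mapping, which is what dropping NUM_PROCESSORS from allocate would amount to.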
This comment was marked helpful 2 times.