I'm wondering what's the implementation of the builtin reduceAdd. Is it O(N), or O(N/P+P), or O(logN) to add all N numbers?
fleventyfive
I think here, reduceAdd has a constant time implementation, since it is used for every single addition. In my mind, all it does is acquire a lock to update diff, add the absolute value of the value by which A[i,j] has changed, and release the lock.
I think what you have in mind is a reduceAdd that synchronizes at the end of all threads, and adds their individual private copies into a single data value. For that, I believe the complexity will be O(NumberOfThreads).
@qqkk could you please elaborate on the other bounds you came up with, like O(logN)? I'm curious.
qqkk
Sorry I messed up with my question. reduceAdd() should be O(NoOfThread). It's just the implementation here to calculate N numbers, complexity should be O(N/P+P).
ZhuansunXt
Assignment here should be performed statically by the compiler, assigning each iteration to program's instances (workers). For example consider it as ISPC foreach semantic.
I'm wondering what's the implementation of the builtin reduceAdd. Is it
O(N)
, orO(N/P+P)
, orO(logN)
to add allN
numbers?I think here,
reduceAdd
has a constant time implementation, since it is used for every single addition. In my mind, all it does is acquire a lock to update diff, add the absolute value of the value by which A[i,j] has changed, and release the lock.I think what you have in mind is a reduceAdd that synchronizes at the end of all threads, and adds their individual private copies into a single data value. For that, I believe the complexity will be O(NumberOfThreads).
@qqkk could you please elaborate on the other bounds you came up with, like O(logN)? I'm curious.
Sorry I messed up with my question.
reduceAdd()
should beO(NoOfThread)
. It's just the implementation here to calculateN
numbers, complexity should beO(N/P+P)
.Assignment here should be performed statically by the compiler, assigning each iteration to program's instances (workers). For example consider it as ISPC foreach semantic.