to be consistent with Amdahl's law, in this case all tasks are paralleled.. s=0. given N>>p, the work to combine results can be ignored and speedup <= 1/(s+(1-s)/p) = p;
Note that if P is within the same order of numbers as N, this does not hold. As an extreme case, if there are N^2 processors, the overhead of combining sums would be N^2 so there would be at most a 2x speedup. Computing sums in a tree like manner would be better if you have a huge number of processors.