This new representation of Amdahl's law takes heterogeneous parallelism into account. Compared with the earlier formula, we now have to consider systems whose cores may each have a different amount of resources.
So the unit of perf(r) is time, if I am thinking about this correctly?
kayvonf
@shabnam: perf(r) is a rate. If a processing core with 1 unit of resources can process work at a rate of 1 (e.g., one instruction per clock), then by this very, very simple model a core with r units of resources can process sqrt(r) instructions per clock.
Throughout the lecture I've assumed perf(1) = 1. Let's also assume a program is a total of 1 unit of work. Then the numerator in the equation above is work / work_rate = 1/1 = 1, which is the time to complete the job sequentially on a core where r=1. The denominator is (sequential_work / sequential_work_rate + parallel_work / parallel_work_rate), which is the time to complete the job on the heterogeneous processor.
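To make the numerator/denominator reading concrete, here is a minimal Python sketch. The equation from the slide is not reproduced in this thread, so the chip layout below (a budget of n resource units, one big core with r units, and n - r one-unit cores that all help with the parallel part) is an assumption for illustration, as are the names `speedup`, `seq_rate`, and `par_rate`.

```python
import math

def perf(r):
    """Toy model from this thread: a core with r resource units runs at rate sqrt(r)."""
    return math.sqrt(r)

def speedup(f, seq_rate, par_rate):
    """Speedup over a single r=1 core for a job of 1 unit of work.

    f        -- fraction of the work that can run in parallel
    seq_rate -- rate at which the sequential part is processed
    par_rate -- aggregate rate at which the parallel part is processed
    """
    baseline_time = 1.0 / perf(1)  # numerator: work / work_rate = 1/1
    hetero_time = (1.0 - f) / seq_rate + f / par_rate  # denominator
    return baseline_time / hetero_time

# Hypothetical chip (assumption): n = 16 resource units in total,
# one big core with r = 4 units plus 12 cores with 1 unit each.
n, r = 16, 4
seq_rate = perf(r)                      # sequential part runs on the big core
par_rate = perf(r) + (n - r) * perf(1)  # every core works on the parallel part
s = speedup(0.9, seq_rate, par_rate)
print(f"speedup = {s:.2f}")
```

With f = 0.9 the denominator is 0.1/2 + 0.9/14, so the sketch reports a speedup of about 8.75 over the single r=1 core.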
RICEric22
Since perf(r) is modeled as sqrt(r), if we total the performance available from a fixed budget of r resource units, we get the most aggregate performance by giving each core a single unit, for a total of r. Combining all the resources in a single core gives only sublinear performance, sqrt(r), which is less than r whenever r > 1.
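A quick way to see the gap is to tabulate both options for a few budgets. This is only a sketch under the same perf(r) = sqrt(r) assumption; the helper names `total_rate_many_small` and `total_rate_one_big` are made up for illustration.

```python
import math

def perf(r):
    """Toy model from this thread: a core with r resource units runs at rate sqrt(r)."""
    return math.sqrt(r)

def total_rate_many_small(r):
    """Aggregate rate of r cores that each get 1 resource unit."""
    return r * perf(1)

def total_rate_one_big(r):
    """Rate of a single core that gets all r resource units."""
    return perf(r)

for r in (1, 4, 16, 64):
    small = total_rate_many_small(r)
    big = total_rate_one_big(r)
    print(f"r={r:3d}  many small cores: {small:5.1f}  one big core: {big:4.1f}")
```

The aggregate figure only helps the parallel part of the program. The sequential part can use just one core, and there the big core's sqrt(r) rate beats the small core's rate of 1.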
Also on a tangent, I found this article pretty cool - http://developer.amd.com/resources/heterogeneous-computing/what-is-heterogeneous-computing/