Slide View : 15-418 Spring 2013

Heterogeneous Parallelism and HW Specialization

Previous | Next --- Slide 28 of 34

Incanam

The idea here is that chip designers want to be conservative, because if they are conservative and wrong, they misuse ~1% of the chip (in this example). IF they're conservative and devote more chip space to rasterization than is needed, they have only wasted that extra chip space devoted to rasterization. However, if they are not conservative and wrong, then they have wasted a much bigger portion of the chip because the rasterization portion serves as a bottleneck (leaving everything else not fully utilized). It's a smarter bet to be conservative.

This comment was marked helpful 0 times.

markwongsk

Is there a concrete calculation as to how 99% -> 80%?

This comment was marked helpful 0 times.

edwardzh

I'm not sure how to get 80% either.

If we thought we needed 1% of the chip for rasterization, and we called that 100% efficiency, then if we end up wanting 1.2x that amount, it seems we have 20% more rasterization workload than expected. Then the rasterization step is taking 1.2x as long. If we assume the rasterization step is always the bottleneck, then 0.2/1.2 of the time the rest of the chip is doing nothing.

Then the efficiency of the rest of the chip is (working time)/(total time) = 1/1.2 = 83.33%.

This comment was marked helpful 0 times.

tpassaro

Disregarding the efficiency computation, I've read of large core chips which know the utilization of cores, and will power down cores which are not needed. This seems like a load balancer, in essence, and I wonder if this could be put to use in GPUs, where there are definitely a large number of cores, all of which may not be doing any work at all.

Further, to help GPU utilization, new NVIDIA GPUs have the ability to run more than a single kernel at once. This might be able to utilize more of the GPU since one kernel does not lock the device from being used.

This comment was marked helpful 0 times.