Why do the CPUs cap out at 95W while the GPUs cap out at 250W? If the GPUs are not melting, why would more power melt a CPU? It seems almost as if the GPUs are pushing the boundaries of power consumption more than CPUs. Can they not continue to increase power consumption and clock frequency in CPUs?
This comment was marked helpful 0 times.
gif
We will cover this in more detail in future lectures, so hopefully you will be able to answer your own question in a few weeks. To give you some idea of what to think about, know that GPUs are designed to solve specific types of parallel workloads, so they can make a bunch of assumptions about what type of programs they should optimize for. CPUs, on the other hand, are still designed to solve problems in a world largely dominated by sequential code and should perform well on any reasonable program. With that in mind, if you know the workloads are more parallel, and given the equation above, can you see how GPUs might be able to get around some of the constraints on CPUs?
This comment was marked helpful 1 times.
dfarrow
This is an interesting question, and I would like to know the answer too. I'll take a stab at it though, so here's my best guess:
Maybe die size has something to do with it. A Haswell CPU has a 177 mm^2 die, and a GTX 780 GPU has a 561 mm^2 die. I would assume that capacity for heat transfer (preventing the chip from melting) increases as surface area increases; or on the flip side, perhaps junction temperature increases as surface area decreases.
TL;DR: I think maybe you can pump more power into a GPU because it's easier to keep it cool.
This comment was marked helpful 0 times.
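One way to make the die-size guess above concrete is to compare power per unit of die area. The sketch below is only illustrative: it assumes the 95W and 250W caps from the original question apply to the Haswell and GTX 780 dies quoted in the comment, and the function name `power_density` is made up for this example.

```python
# Rough power density (watts per mm^2 of die) for the two chips discussed.
# Assumption: the 95 W / 250 W caps from the question are paired with the
# 177 mm^2 / 561 mm^2 die sizes from the comment. Real parts may differ.

def power_density(tdp_watts: float, die_area_mm2: float) -> float:
    """Average power dissipated per square millimetre of die."""
    return tdp_watts / die_area_mm2

cpu_density = power_density(95.0, 177.0)
gpu_density = power_density(250.0, 561.0)

print(f"CPU: {cpu_density:.2f} W/mm^2")  # CPU: 0.54 W/mm^2
print(f"GPU: {gpu_density:.2f} W/mm^2")  # GPU: 0.45 W/mm^2
```

Under these assumed numbers the GPU draws more total power but slightly less power per unit area, which is at least consistent with the surface-area guess. It ignores everything else that matters for cooling, such as the heatsink, airflow, and hot spots on the die.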
kayvonf
@mofarrel and @dfarrow, I'd like to hear your thoughts on this question after the next lecture.
This comment was marked helpful 0 times.
ycp
Would it be unrealistic to have some sort of "cooling units" that could keep the chips cooler so they could use more power? I'm sure people have thought of this, but I'm not sure why it's not a good idea.
This comment was marked helpful 0 times.
Orangensaft
I would assume it's because it's just too much power. If you have a computer already drawing as much power as half a microwave and then you add a huge cooler to cool the computer, you would be using a lot of energy. Energy is a resource too, so it's important to make efficient use of it.
This comment was marked helpful 0 times.
dfarrow
@ycp on a PC (assuming you mean something more effective than just a simple heatsink+fan) that's definitely an option; I'm thinking about liquid cooling in particular. You can actually find a liquid cooling kit for relatively cheap, $50 here, but they can get pretty expensive. I think liquid cooling is probably used almost exclusively by the hardcore-gaming/enthusiast/special-purpose market though - by those who are into overclocking and are comfortable with building their PCs from scratch.
For mobile phones, tablets, etc., I think we're pretty much stuck with passive cooling, and it is hard to want to overclock my phone when my hand is the heatsink.
This comment was marked helpful 2 times.
yetianx
The larger a chip's surface, the easier it is to cool. But the modern computer industry is pursuing smaller chips, which makes cooling harder.
With the increased use of centralized data centers that keep servers cool on a massive scale and allow distributed computing over a network instead of computing everything locally, would this mitigate the need to worry about power consumption and cooling individual PCs and their chips?
This comment was marked helpful 0 times.
dfarrow
@kayvonf, I'm still not entirely certain why there's such a large TDP difference between consumer CPUs and GPUs. I'm trying to think in terms of architecture now. I guess the question is: how does one architecture (GPU) allow the chip to have a higher TDP without melting than another (CPU)? I totally appreciate that the GPU is executing a ton of instructions simultaneously, so it makes sense that it would give off more heat than a CPU would per cycle. But it seems to me like the TDP of a processor shouldn't depend on the specific architecture of that processor, but rather on how much power you can give it without it melting. Which leads me to the "greater surface area = more heat transfer = higher TDP" theory. Any other hints?
This comment was marked helpful 0 times.
paraU
I can't see the relationship between frequency and voltage in the formula. Power is linear in frequency and quadratic in voltage. How are voltage and frequency related?
This comment was marked helpful 0 times.
yixinluo
@paraU There is not a direct relation between voltage and frequency. You can scale the frequency of a processor independently of the voltage. However, the peak frequency of the processor is determined by the voltage, as you can tell from slide 28. If the frequency is set above the peak frequency, the processor can malfunction due to circuit failure.
This comment was marked helpful 0 times.
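To see why the voltage and frequency link matters for power, here is a minimal sketch. It assumes the formula under discussion has the usual dynamic-power form (power proportional to capacitance × voltage² × frequency) and, as a simplification, that the peak frequency scales linearly with voltage. The constant `k`, the unit-free numbers, and the function names are hypothetical.

```python
# Toy model: dynamic power = C * V^2 * f, with the ASSUMPTION that the
# highest stable frequency is proportional to voltage (f_peak = k * V).
# All values are unit-free and purely illustrative.

def dynamic_power(capacitance: float, voltage: float, frequency: float) -> float:
    return capacitance * voltage ** 2 * frequency

def min_voltage_for(frequency: float, k: float = 1.0) -> float:
    """Lowest voltage that still supports `frequency` under f_peak = k * V."""
    return frequency / k

def power_at(frequency: float, capacitance: float = 1.0, k: float = 1.0) -> float:
    voltage = min_voltage_for(frequency, k)
    return dynamic_power(capacitance, voltage, frequency)

base = power_at(1.0)
fast = power_at(1.2)   # 20% higher clock, voltage raised to match
ratio = fast / base    # equals 1.2 ** 3 under this model

print(f"20% higher clock costs about {ratio:.2f}x the power")
```

Under these assumptions, raising the clock also forces the voltage up, so power grows roughly with the cube of frequency rather than linearly. Real chips also have leakage power and a minimum operating voltage, which this sketch ignores.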
gif
@dfarrow Since GPUs do not have the same requirements with respect to sequential performance, can you think about what kind of constraints on CPU manufacturers have been lifted for GPU manufacturers? Think about why GPUs can afford a larger surface area.
This comment was marked helpful 0 times.
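As one possible way to think about the hint above, the sketch below compares one fast core against several slower cores that deliver the same total throughput. It assumes a perfectly parallel workload, the dynamic-power form C × V² × f, and a peak frequency proportional to voltage. The function `total_power` and the constant `k` are hypothetical, and the numbers are unit-free.

```python
# Toy comparison: same total throughput from 1 fast core vs N slow cores.
# ASSUMPTIONS: perfectly parallel work, power per core = C * V^2 * f,
# and each core runs at the lowest voltage that supports its clock
# (V = f / k). Leakage power and minimum voltage are ignored.

def total_power(num_cores: int, total_throughput: float,
                capacitance: float = 1.0, k: float = 1.0) -> float:
    freq = total_throughput / num_cores   # each core runs slower
    voltage = freq / k                    # lower clock allows lower voltage
    per_core = capacitance * voltage ** 2 * freq
    return num_cores * per_core

one_core = total_power(1, 4.0)
four_cores = total_power(4, 4.0)

print(f"1 core:  {one_core:.1f}")    # 1 core:  64.0
print(f"4 cores: {four_cores:.1f}")  # 4 cores: 4.0
```

In this idealized model, spreading the same work over four slower cores cuts power by a factor of 16, at the cost of more die area. A design that can count on parallel work can therefore trade clock rate for area, whereas a chip that must run sequential code fast cannot.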
benchoi
CPUs, due to their need for higher sequential performance, need to run at much higher clock rates than GPUs. How does this link to running hotter though? It doesn't seem that CPUs run at significantly higher voltages than GPUs.
This comment was marked helpful 0 times.
Leomabiao
If this comment is not too late: in the Cloud Computing course, we learned that the cost of the cooling system for a data center is significant, so I am wondering what the key tradeoffs are here (heat generation vs. frequency vs. number of cores)?
This comment was marked helpful 0 times.
Leomabiao
We know that the cooling system accounts for a significant portion of a data center's total expense. Could anyone share their thoughts on the key tradeoffs here?
This comment was marked helpful 0 times.
putthatcookiedown
@Benchoi TDP isn't directly linked to temperature. CPUs and GPUs actually run at similar temperatures (around 65 C). TDP stands for Thermal Design Power, and it is a measure of how much power, in watts, the chip dissipates as heat. While the temperature at any single point on a CPU or GPU is about the same, the GPU is larger, so its surface collectively dissipates more heat energy. Additionally, while TDP is a measure of heat output, it is also strongly correlated with the chip's power consumption, due to conservation of energy.
@ycp When people push the limits of chips, they frequently use supercooled liquefied gases.
This comment was marked helpful 0 times.