Slide 29 of 36
bwasti

Intel's new Haswell architecture, which stresses low power consumption, is built for mobile devices. Does the development of low power chips help frequency scaling? How does this relationship work? (I'm guessing there are diminishing returns in power reduction as well.)

yixinluo

@bwasti Your question is unclear to me; I think you may be conflating the relations among P(ower), V(oltage), and f(requency). As stated in slide 27, P is proportional to V^2 * f. In addition, the peak frequency of a CPU is proportional to V. So if you run a CPU at its peak frequency (why not), its frequency (read: performance) is proportional to V, while its power consumption is proportional to V^3, so its energy efficiency degrades as V^2. As you can see, both the performance and the efficiency of the CPU are determined by the voltage applied to it. As a result, using an ultra-low-power CPU is not a way to improve performance, but rather a performance compromise made in favor of battery life, CPU efficiency, and cooling. I hope this helps answer your question.
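The relations above can be put in a few lines of code. This is a sketch using only the simplified proportionalities from this thread (P ~ V^2 * f, f_peak ~ V), normalized to a baseline voltage of 1.0; real chips deviate from these idealized scaling laws.

```python
def relative_metrics(v):
    """Metrics relative to a baseline voltage of 1.0, at peak frequency."""
    freq = v                      # f_peak ~ V, so performance ~ V
    power = v ** 3                # P ~ V^2 * f = V^3 at peak frequency
    energy_per_op = power / freq  # ~ V^2: efficiency falls as V rises
    return freq, power, energy_per_op

for v in (0.8, 1.0, 1.2):
    f, p, e = relative_metrics(v)
    print(f"V={v:.1f}: perf={f:.2f}x  power={p:.2f}x  energy/op={e:.2f}x")
```

Note how raising V by 20% buys 20% more performance but costs roughly 73% more power, which is why frequency scaling hit a wall.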

taegyunk

I don't quite understand how low-frequency multi-core processors are faster than extremely high-frequency single-core processors. They might give a better user experience when the user runs multiple programs on his/her PC, but if we just compare a single core from each type of processor, the high-frequency single-core processor would win. I would need some more elaboration on the phrase "Architects are now building faster processors by adding processing cores" in the lecture slide.

iamk_d___g

@taegyunk: I think if there is only one thread running, the single core must win. But that is not the typical case: the OS already has many processes running concurrently, and then there are the applications on top of that. Furthermore, we can now parallelize our traditionally serial programs to get a speedup.

arjunh

@taegyunk: The main idea is that computer chip architects are now using the extra transistors provided as per Moore's Law to add additional cores to chips, rather than making the pre-existing cores more powerful.

A powerful single core would have numerous facilities to speed up parts of its execution (namely fetching/decoding instructions, executing them, and storing the results in registers). For example, it could have a smarter fetch mechanism that detects ILP in the instruction stream, or a caching facility to store frequently used values that don't fit in the register set.

Instead of going for these single-core enhancements, chip architects today instead decide to use the extra transistors to add extra cores. This means that any single core on a chip with multiple cores is less powerful than the single core on a chip with one enhanced core. However, in practice, chip designers are finding that the extra parallelism gained by using multiple cores leads to significant speedup; it's now up to programmers to exploit these extra cores with code that can be readily parallelized, with as little overhead as possible.
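The trade-off above can be sketched with a simple Amdahl's-law-style model. The numbers here (a big core 1.5x faster than baseline vs. four baseline cores, and the serial fractions) are illustrative assumptions of mine, not figures from the slides.

```python
def time_on_chip(work, serial_fraction, cores, perf_per_core):
    """Time to finish `work` units: serial part runs on one core,
    parallel part is split evenly across all cores."""
    serial = work * serial_fraction / perf_per_core
    parallel = work * (1 - serial_fraction) / (perf_per_core * cores)
    return serial + parallel

work = 100.0
for frac in (0.0, 0.2, 0.5):
    big = time_on_chip(work, frac, cores=1, perf_per_core=1.5)
    multi = time_on_chip(work, frac, cores=4, perf_per_core=1.0)
    print(f"serial={frac:.0%}: big-core={big:.1f}  4-core={multi:.1f}")
```

With mostly parallel work the four slower cores finish far sooner; as the serial fraction grows, the single enhanced core catches up, which is exactly the tension programmers now have to manage.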

kayvonf

@taegyunk: Consider the following case. Let's say the Pittsburgh airport is expecting a huge crowd of CMU students trying to get out to the west coast ASAP because they want to join Nest before it gets sold to Google for $3.2B. (Oops, too late!)

You are the security manager of the airport, and you have a funding budget of \$50/hr to pay all your TSA workers at the security check. You can tell worker A to come to work, who is a top-notch senior employee and can get 100 people through her line in an hour. However, due to her seniority and consistently high performance, she makes \$50/hr. Or you can tell workers B and C to come to work, who are less experienced and can only get 75 people each through their lines in an hour. However, since they are new, they each make only \$25/hr.

Who do you tell to come to work today?

taegyunk

@kayvonf: Definitely, hiring the two less experienced workers would be better, and that explains it well! Thank you!
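For completeness, the arithmetic behind the analogy checks out in a few lines (wages here stand in for the chip's power budget, throughput for performance):

```python
budget = 50  # dollars per hour, analogous to a fixed power budget

worker_a = {"rate": 100, "wage": 50}   # senior: 100 people/hr at $50/hr
worker_bc = {"rate": 75, "wage": 25}   # junior: 75 people/hr each at $25/hr

option_a = worker_a["rate"]            # one fast worker: 100 people/hr
option_bc = 2 * worker_bc["rate"]      # two slow workers: 150 people/hr

# Both options spend exactly the same budget.
assert worker_a["wage"] == budget
assert 2 * worker_bc["wage"] == budget

print(f"worker A: {option_a}/hr   workers B+C: {option_bc}/hr")
```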

paraU

Transistors are still shrinking, so we can put more transistors on a single chip. That makes sense. Then we have two choices: spend them on a single core or on multiple cores. The answer is multiple cores, because with multiple cores we can run multiple tasks at the same time. But my question is: what is the relationship between transistor count and core performance? In my understanding, core performance is only correlated with frequency, so I'm quite curious what a core would gain if we gave it more transistors.

yixinluo

@paraU If you recall slides 22/23/25, besides frequency, performance is also related to ILP. Here is an equation that explains this:

Instrs/Second = (Instrs/Cycle) * (Cycles/Second)

Performance = ILP * Frequency

Given a single application, the processor's architecture decides how much ILP it can exploit, and more transistors can always be traded for a better architecture. Recall the example from slide 25: processing more instructions per cycle improves ILP, but it requires additional logic (implemented with transistors) to handle multiple instructions at once. And the complexity of this logic increases exponentially with the amount of ILP it can exploit, because you need to handle dependencies among more instructions.
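Plugging numbers into the identity above makes the point concrete. The two design points below are made-up values for illustration only, not real processor specifications.

```python
def perf(ipc, freq_ghz):
    """Instructions per second = (instrs/cycle) * (cycles/second)."""
    return ipc * freq_ghz * 1e9

# A narrow core clocked high vs. a wider (superscalar) core clocked lower.
narrow_fast = perf(ipc=1.0, freq_ghz=4.0)  # 1 instr/cycle at 4 GHz
wide_slow = perf(ipc=3.0, freq_ghz=2.0)    # 3 instrs/cycle at 2 GHz

print(f"narrow/fast: {narrow_fast:.1e} instrs/s")
print(f"wide/slow:   {wide_slow:.1e} instrs/s")
```

The wider core wins despite its lower clock, which is exactly why spending transistors on exploiting ILP (and not just on frequency) improves performance.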

Dave

I'm sure I'm way oversimplifying things, but here's my understanding: with, say, a 22nm process, Intel figures it can put 1.5 billion transistors on a chip. To keep energy consumption and heat dissipation down to something reasonable, they can't possibly push the chip's clock speed past 3.5 GHz. But the wizards at Intel can basically divide up their fixed number of transistors into some combination of redundancy (for parallelism) and features (like dedicated hardware for specific operations and quicker fetch/decode logic).

Here's my question, though: if there are some applications that are stubbornly serial, why doesn't Intel have any processors that devote practically all available transistors to single-threaded performance? It seems like the fastest single-threaded performance you can get from Intel today is in the i7-4960X, a 6-core chip! Turbo Boost can turn off the 5 unused cores and overclock the remaining core when doing serial-heavy tasks, but wouldn't that speed bump pale in comparison to the benefit of 4 or 5 cores' worth of extra transistors? I get that more and more programs are written for parallel processors, but would it be fair to say Intel has forsaken the ones that aren't?