Slide View : Parallel Computer Architecture and Programming : 15-418/618 Spring 2017

nemo

For check 2, superscalar execution entails multiple instruction streams due to instruction-level parallelism. Correct?

pdp

@nemo: I think for check 2, it is a single instruction stream if the superscalar execution is performed on a single-core processor; otherwise it is not. It really depends on the processor architecture. Quoting from Wikipedia: "In Flynn's taxonomy, a single-core superscalar processor is classified as an SISD processor (Single Instruction stream, Single Data stream), though many superscalar processors support short vector operations and so could be classified as SIMD (Single Instruction stream, Multiple Data streams). A multi-core superscalar processor is classified as an MIMD processor (Multiple Instruction streams, Multiple Data streams)."

When we spoke about this in the first lecture, we were probably referring to a single instruction stream only. To add to this, I think superscalar execution works on a single instruction stream but can execute independent instructions from that stream in parallel, possibly out of order (instruction-level parallelism).

axiao

@nemo: Slide 72 has a good example of superscalar execution. There is only a single instruction stream in that example, but with 2 fetch/decode units to allow for execution of 2 independent instructions from the single instruction stream.

apr

For check 1, a single instruction stream is a single sequence of instructions whose results must match what you would get by executing those instructions one at a time in the given order. The performance of a single instruction stream could be improved by increasing the clock speed (limited by heat dissipation and power consumption, even though transistors were getting smaller and faster) OR by finding independent instructions in the stream that can be executed in parallel using multiple fetch/decode and execute units (ILP). ILP logic could use the extra chip area made available by shrinking transistors, but it did not deliver the expected returns because it is hard to find independent instructions in a program. This was mainly because iterative programming is easy for programmers, so most code is iterative logic, some of which the processor cannot unroll.
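
To make the point about independent instructions concrete, here is a minimal sketch of a toy issue model. It is not how any real processor is built. It assumes every instruction takes one cycle, only true (read-after-write) dependencies matter, and up to `width` ready instructions can issue per cycle. The function name `cycles_needed` and the register names are made up for illustration.

```python
def cycles_needed(instrs, width):
    """Toy model: instrs is a list of (dest, [srcs]) in program order.

    Assumes unit latency and only read-after-write dependencies.
    Returns the number of cycles needed to issue every instruction.
    """
    # For each instruction, record which earlier instructions produce its inputs.
    last_writer = {}
    deps = []
    for i, (dest, srcs) in enumerate(instrs):
        deps.append({last_writer[s] for s in srcs if s in last_writer})
        last_writer[dest] = i

    issued_at = {}                      # instruction index -> issue cycle
    pending = list(range(len(instrs)))
    cycle = 0
    while pending:
        issued = []
        for i in pending:
            if len(issued) == width:
                break
            # Ready only if every producer issued in an earlier cycle.
            if all(d in issued_at and issued_at[d] < cycle for d in deps[i]):
                issued.append(i)
        for i in issued:
            issued_at[i] = cycle
        pending = [i for i in pending if i not in issued_at]
        cycle += 1
    return cycle


# a = x*x + y*y + z*z, broken into five simple instructions.
program = [
    ("t0", ["x", "x"]),
    ("t1", ["y", "y"]),
    ("t2", ["z", "z"]),
    ("t3", ["t0", "t1"]),
    ("t4", ["t3", "t2"]),
]

# A chain where every instruction needs the previous result.
chain = [("a", ["x"]), ("b", ["a"]), ("c", ["b"]), ("d", ["c"])]

print(cycles_needed(program, 1))  # 5
print(cycles_needed(program, 2))  # 3
print(cycles_needed(program, 3))  # 3
print(cycles_needed(chain, 4))    # 4
```

Under these assumptions, going from 1 to 2 issue slots helps the first program, but a third slot adds nothing because the dependency chain is 3 instructions long. The fully dependent chain gets no benefit from any width.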

paracon

@pdp, a single-core processor supporting simultaneous multi-threading can execute multiple instruction streams on multiple data streams. That would make a single-core processor MIMD. Please correct me if I am wrong!

sadkins

A single instruction stream is a sequence of instructions that appear to be sequential (not parallel) in code. The processor may use things like ILP and out-of-order logic to speed up the execution, but the result of the instruction stream is the same as if it were run sequentially.

o_o

The things that prevented us from obtaining maximum speedup were mainly: 1. There was an unequal distribution of work, where some processors were idle while others were still doing work. 2. There was communication overhead from all the processors trying to pass information to each other. 3. Partitioning the work and merging the results is itself work, which takes time as well.
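
A rough way to see how these three factors cut into speedup is the simplified model below. It assumes the parallel run takes as long as the busiest processor, and that communication and partition/merge time simply add on top of that. The function name `speedup` and the numbers are hypothetical, not measurements.

```python
def speedup(work_per_proc, comm_time=0.0, overhead_time=0.0):
    """Simplified model of parallel speedup.

    work_per_proc: time units of work assigned to each processor.
    comm_time: extra time spent communicating (assumed not overlapped).
    overhead_time: extra time spent partitioning and merging.
    """
    sequential_time = sum(work_per_proc)
    # Everyone waits for the busiest processor, then pays the extra costs.
    parallel_time = max(work_per_proc) + comm_time + overhead_time
    return sequential_time / parallel_time


print(speedup([25, 25, 25, 25]))        # 4.0  (perfectly balanced, no overheads)
print(speedup([40, 20, 20, 20]))        # 2.5  (one processor has twice the work)
print(speedup([25, 25, 25, 25], 5, 5))  # about 2.86 (balanced, but with overheads)
```

Even with the same total work, either imbalance or overhead alone is enough to pull the result well below the ideal 4x in this model.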

kayvonf

@paracon. I'd agree with that.

sampathchanda

For 1, a single instruction stream is a single sequence of instructions that are roughly required to be executed in sequential order (roughly, since this doesn't hold strictly when considering out-of-order execution). Performance of a single instruction stream has not improved much because of the complexity of the optimizations and the dependence on sequential processing. From the hardware architecture side, it may be a little easier now to get better performance another way: in cases like GPUs, many simpler processing units running in parallel tend to provide better speedup than improving single-core performance. Please correct me if I am wrong!