Question: This would be a good slide for someone to describe in their own words. How does this figure illustrate a combination of simultaneous multi-threading and interleaved multi-threading?
The system has two fetch/decode units and two ALUs, so two independent instruction streams can run in parallel. Moreover, the four execution contexts make it possible to interleave four different instruction streams. Thus, four hardware threads can run concurrently on this core, while only two can run in parallel at any instant. As seen in the figure, the four threads use the four execution contexts concurrently (interleaved multi-threading). For example, in the initial set of clocks, thread 0 runs on one ALU while threads 1, 2, and 3 take turns on the other: that is simultaneous multi-threading across the two ALUs, with threads 1, 2, and 3 interleaved on the same ALU.
Thus, combining simultaneous and interleaved multi-threading increases the chance of finding independent instructions to run each clock, which lets the system extract more parallelism at the cost of some cheap hardware-level context switching.
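To make the idea concrete, here is a toy sketch in Python of a core that picks up to two ready threads per clock out of four execution contexts. Everything in it is an assumption for illustration: the function name `simulate`, the `work` and `stall` parameters, and the "lowest-numbered ready thread wins" policy are made up, and the schedule it produces is not the one drawn on the slide.

```python
def simulate(num_alus=2, num_contexts=4, work=2, stall=2, clocks=8):
    """Toy model: each clock, run up to num_alus of the ready threads.

    Hypothetical assumptions: every thread computes for `work` clocks,
    then stalls for `stall` clocks (e.g. waiting on memory).
    """
    remaining = [work] * num_contexts  # clocks of work left before the next stall
    ready_at = [0] * num_contexts      # first clock at which each thread may run again
    timeline = []
    for clock in range(clocks):
        runnable = [t for t in range(num_contexts) if ready_at[t] <= clock]
        chosen = runnable[:num_alus]   # simultaneous: up to num_alus threads per clock
        for t in chosen:
            remaining[t] -= 1
            if remaining[t] == 0:      # thread stalls; another context takes over
                remaining[t] = work
                ready_at[t] = clock + 1 + stall
        timeline.append(chosen)
    return timeline


if __name__ == "__main__":
    for clock, threads in enumerate(simulate()):
        print(f"clock {clock}: threads {threads}")
```

In this sketch, no clock ever has more than two threads running (the simultaneous part), while all four contexts get turns over time (the interleaved part).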
A silly way of finding the number of ALUs in the system: draw vertical lines and take the maximum number of threads that any one line intersects. Here it is 2, so there are 2 ALU + fetch units.
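The vertical-line trick can be sketched as a small sweep over the busy intervals of each thread. This is only an illustration: the `busy` timeline below is hypothetical (not read off the slide), `max_overlap` is a made-up helper name, and intervals are assumed to be half-open `(start, end)` ranges in clocks.

```python
def max_overlap(intervals):
    """Return the largest number of intervals active at the same instant.

    Each interval is (start, end) in clocks, with end exclusive.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))   # a thread starts running
        events.append((end, -1))    # a thread stops running
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so a thread that stops at clock t does not overlap one starting at t.
    events.sort()
    running = best = 0
    for _, delta in events:
        running += delta
        best = max(best, running)
    return best


# Hypothetical busy intervals per thread (an assumption, not the slide's data).
busy = {
    0: [(0, 4), (8, 12)],
    1: [(0, 2), (6, 8)],
    2: [(2, 4), (8, 10)],
    3: [(4, 6), (10, 12)],
}
all_intervals = [iv for spans in busy.values() for iv in spans]

if __name__ == "__main__":
    print(max_overlap(all_intervals))
```

Note that the result only says how many threads were ever observed running at once, so it is a lower bound on the number of ALUs rather than an exact count.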
I agree with pk267. By drawing vertical lines we can tell the minimum number of ALUs and fetch units. Threads 2 and 3 demonstrate interleaved multi-threading, since they never run at the same time.
However, I'm not convinced that two has to be the exact number of ALUs. Couldn't there be more than two ALUs, with something else preventing more instructions from running in parallel, such as dependencies in the code?