From what I understand about operating system context switches, the OS suspends execution of one thread on the CPU and resumes execution of some other thread that had previously been suspended. (Let's say the system has only one core, with multi-threading enabled.) However, in these slides, it seems that each core has control over which thread it devotes execution resources to at any given time. My question is about how the OS synchronizes its job handling with the processors.
If the OS suspends a thread that happens to be running on CPU A, does it signal CPU A not to context switch to that specific thread until the OS resumes it? Does the CPU run suspended threads and just not deliver the result to the OS, or does the OS replace the thread running on the core with a new thread and save the old context?
In general, I'm pretty unclear about how hardware-supported multi-threading interacts with software-supported multi-threading.
This comment was marked helpful 3 times.
yixinluo
@vrkrishn First of all, your last question is a very good one, and your understanding of software context switches in the first paragraph is correct.
To answer your question, I should clarify what a hardware context switch is and how it differs from a software one. Hardware Context Switches (HCS, Yixin's abbreviation) are, by definition, triggered by hardware events; Software Context Switches (SCS) are triggered by software. Hardware events (such as last-level cache misses) are much more frequent than software events (the OS scheduler is triggered every 10-100 ms). This is why HCSes are usually designed to have lower overhead than software context switches: HCSes make very simple scheduling decisions (such as round robin) without any software help (no interrupts) and usually switch between a bounded number of thread contexts stored on chip.
Going back to your question, how do hardware and software thread management interact? As introduced above, an HCS usually switches between a limited number of thread contexts; let's call them hardware threads. Software thread schedulers assign software threads to hardware threads every 10-100 ms. The processor can then pick whichever hardware thread it can run and switch quickly between them.
As for the questions in your second paragraph, your last question already contains the answer. When the OS suspends a thread, it saves the old context to memory and then schedules a new thread.
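To make the two levels concrete, here is a toy model in Python. It is only a sketch under simplifying assumptions: the function name `simulate`, its parameters, and the cycle counts are made up for illustration, and a real OS quantum (10-100 ms) is many orders of magnitude longer than the few cycles used here. Once per quantum the "OS" loads a bounded number of software threads into the hardware contexts (a software context switch); within the quantum the "core" round-robins among the loaded contexts every cycle (a hardware context switch).

```python
from collections import deque

def simulate(software_threads, hw_contexts=2, quantum=4, total_cycles=16):
    """Return which thread issues in each cycle of the toy model."""
    run_queue = deque(software_threads)   # OS-level ready queue (in memory)
    trace = []
    for start in range(0, total_cycles, quantum):
        # Software context switch: the OS picks which threads live on chip.
        count = min(hw_contexts, len(run_queue))
        loaded = [run_queue.popleft() for _ in range(count)]
        if not loaded:
            break                         # nothing to run
        for cycle in range(min(quantum, total_cycles - start)):
            # Hardware context switch: plain round robin, no OS involvement.
            trace.append(loaded[cycle % len(loaded)])
        # The OS saves the on-chip contexts back to memory.
        run_queue.extend(loaded)
    return trace

print("".join(simulate(["A", "B", "C", "D"], total_cycles=8)))
```

With four software threads and two hardware contexts, A and B alternate for the first quantum, then the OS swaps in C and D. The hardware never sees more than two threads at a time; the other two wait in memory until the next software scheduling decision.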
Also see the discussion about hyper-threading on slide 64.
This comment was marked helpful 5 times.
nbatliva
Following from the previous question, it then seems that it is sub-optimal to spawn more threads on a single core than the underlying hardware supports. Is that correct? That's not to say you can't have as many threads as you want, but if you have more threads than the hardware supports, the OS has to step in with software to manage all of them, swapping thread contexts back and forth between memory and the hardware (slow).
This comment was marked helpful 0 times.
kayvonf
Most modern OSes have system calls that can be used to set the affinity of a thread. A thread's affinity specifies (or strongly hints at) what physical execution resource it should be mapped to. For example, if you are interested in mapping a pthread to a particular processor, you may be interested in the pthread library function pthread_setaffinity_np() (its counterpart, pthread_getaffinity_np(), reads the current setting).
You can also set the affinity of a process using sched_setaffinity().
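A quick way to experiment with affinity without writing C is Python's `os` module, which on Linux exposes `os.sched_getaffinity()` and `os.sched_setaffinity()` as thin wrappers over the same system calls. The sketch below is only an illustration: the helper names `current_mask` and `run_pinned` are made up, picking the lowest allowed CPU is an arbitrary choice, and the code assumes a Linux machine (elsewhere it just runs the function unpinned).

```python
import os

def current_mask():
    """Return the set of logical CPUs the caller may run on, or None."""
    if hasattr(os, "sched_getaffinity"):
        return os.sched_getaffinity(0)    # 0 means "the caller"
    return None

def run_pinned(fn):
    """Run fn() while pinned to one logical CPU, then restore the old mask.

    Returns (cpu, result); cpu is None where affinity calls are unavailable.
    """
    original = current_mask()
    if original is None:
        return None, fn()
    cpu = min(original)                   # arbitrary choice: lowest allowed CPU
    os.sched_setaffinity(0, {cpu})
    try:
        return cpu, fn()
    finally:
        os.sched_setaffinity(0, original) # undo the pinning

cpu, mask_while_pinned = run_pinned(current_mask)
print("pinned to", cpu, "mask while pinned:", mask_while_pinned)
```

Note that the numbers in the mask are logical CPUs as the OS sees them, so on a hyper-threaded machine two different numbers may refer to two hardware threads of the same physical core.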
A cool comment from last year:
You can also reconfigure the scheduler using sched_setscheduler(). The SCHED_FIFO scheduler is (roughly) a cooperative scheduler, where your thread will run until you block on I/O or call sched_yield() (or a higher-priority process becomes runnable).
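For anyone who wants to try this, here is a hedged sketch using the wrappers in Python's `os` module (`os.sched_setscheduler()`, `os.sched_param`, `os.sched_yield()`). The function name `try_sched_fifo` and its return strings are made up for illustration. It assumes Linux, and switching to SCHED_FIFO normally requires root or the CAP_SYS_NICE capability, so on a shared machine expect it to report "denied".

```python
import os

def try_sched_fifo():
    """Briefly switch the caller to SCHED_FIFO and report the outcome."""
    if not hasattr(os, "sched_setscheduler"):
        return "unsupported"
    old_policy = os.sched_getscheduler(0)         # 0 means "the caller"
    old_param = os.sched_getparam(0)
    lowest = os.sched_get_priority_min(os.SCHED_FIFO)
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(lowest))
    except PermissionError:
        return "denied"       # usually needs root or CAP_SYS_NICE
    except OSError:
        return "error"
    try:
        # Voluntarily let another runnable thread of the same priority go.
        os.sched_yield()
        return "fifo"
    finally:
        # Put the original policy back so the experiment has no side effects.
        os.sched_setscheduler(0, old_policy, old_param)

print(try_sched_fifo())
```

The sketch uses the lowest SCHED_FIFO priority and restores the old policy right away on purpose: a SCHED_FIFO thread that spins without blocking or yielding can starve ordinary threads on that CPU.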
This comment was marked helpful 0 times.
kayvonf
If you log into the machines in Gates 5201/5205 and type less /proc/cpuinfo you'll get a report about the processor. The Intel CPUs in those machines have four cores, but hyper-threading is disabled in their current configuration, so the cpuinfo report states there are four cores.
The Intel CPUs in the Gates 3000 machines have six cores, but hyper-threading is enabled. You'll find cpuinfo will report 12 logical cores although there are only 6 physical ones. In the terms I used in class, the CPU has 6 cores, but manages up to 12 independent hardware thread execution contexts.
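If you would rather count than scroll, the logical-versus-physical distinction can be pulled out of the cpuinfo text with a few lines of Python. This is a sketch under assumptions: the function name `count_cpus` is made up, and it assumes the x86 Linux layout of /proc/cpuinfo (blank-line-separated blocks containing "processor", "physical id", and "core id" fields). Other architectures may omit the last two, in which case the core count below is not meaningful.

```python
def count_cpus(cpuinfo_text):
    """Return (logical_processors, physical_cores) from /proc/cpuinfo text."""
    logical = 0
    cores = set()
    for block in cpuinfo_text.strip().split("\n\n"):
        fields = {}
        for line in block.splitlines():
            key, sep, value = line.partition(":")
            if sep:
                fields[key.strip()] = value.strip()
        if "processor" in fields:
            logical += 1
            # Hardware threads on the same core share (socket, core id).
            cores.add((fields.get("physical id"), fields.get("core id")))
    return logical, len(cores)

# Tiny hand-written sample: one socket, two cores, two hardware threads each.
sample = (
    "processor\t: 0\nphysical id\t: 0\ncore id\t: 0\n\n"
    "processor\t: 1\nphysical id\t: 0\ncore id\t: 1\n\n"
    "processor\t: 2\nphysical id\t: 0\ncore id\t: 0\n\n"
    "processor\t: 3\nphysical id\t: 0\ncore id\t: 1\n"
)
print(count_cpus(sample))
```

On a real machine you would pass in the contents of /proc/cpuinfo instead of the sample string. Given the description above, a Gates 3000 machine would be expected to report 12 logical processors on 6 cores.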
This comment was marked helpful 0 times.
yihuaz
I'm a little confused about simultaneous multi-threading. For Intel Hyper-threading (2 threads per core), does that mean there are two execution contexts in the core, and two threads can run in parallel so that there is no context switch?
This comment was marked helpful 0 times.
kayvonf
@yihuaz: Please see the discussion about hyper-threading on slide 64.
A funny note about hyper-threading:
In one instance, Intel didn't implement it completely bug-free: https://bugs.freedesktop.org/show_bug.cgi?id=30364
If hyper-threading is enabled, for some reason the v-sync interrupt from the GPU would not be capable of waking the CPU from a sleep state...
buggy hardware FTL...
This comment was marked helpful 0 times.