Previous | Next --- Slide 46 of 65
Back to Lecture Thumbnails
pd43

How does the latency of a stall on memory compare with the latency of a context switch? It was mentioned at some point that we can approximate a cache miss as ~100 clock cycles.

alex

Two things: a context switch in the OS sense is very different than a context switch in this sense. The former changes the thread that is running on a CPU and the latter changes between thread contexts in the same processor. There is a subtle distinction there.

Also, I just found this awesome infographic where you can see the differences between some latencies over the past several years.

edit: It is important to realize that this slide probably does not use thread in the OS thread sense but rather in the hardware thread context sense. The word thread is somewhat overloaded and will mean different things at different points in this course - we will attempt to address some of the nuance but make sure you pay attention to this.

kayvonf

In the context of interleaved hardware multi-threading, you can safely think of a processor being able to switch the active execution context (a.k.a. active thread) every cycle.

Keep in mind that from the perspective of the OS, a multi-threaded processor is simply capable of running N kernel threads. In most cases, the OS treats each hardware execution context as a virtual core and it remains largely oblivious to the fact that the threads running within these execution contexts are being time multiplexed on the same physical core.

The OS does, however, need to have some awareness about how kernel threads are mapped to processor execution contexts. Consider the processor above, which has four cores, and can interleave execution of four logical threads on each core. Now consider a program that spawns four pthreads. It would certainly be unwise for the OS to associate the four pthreads onto four execution contexts on the same physical core! Then one core would be busy running the threads (interleaving their execution on its resources) and the other three would sit idle!

Question: For those of you that have taken 15-410, can you list all that must happen for an OS to perform a content switch between two processes (or kernel threads)? It should quickly become clear that an OS process context switch involves quite a bit of work.

lkung

OS process context switching involves the overhead of storing the context of the process you're switching from (A) and then restoring the context of the process you want to switch to (B).

  • saving A's registers
  • restoring B's saved registers
  • re-scheduling A, setting B as running
  • loading B's PCB
  • switching to B's PDBR, which usually flushes the TLB (resulting in misses when translating virtual -> physical addresses as the cache warms up again)

The 410 book (Operating System Concepts) says that typical context switch speeds are a few milliseconds (I have the 7th ed, so it may be a bit outdated). "It varies from machine to machine, depending on the memory speed, the number of registers that must be copied, and the existence of special instructions (such as a single instruction to load or store all register). Context-switch times are highly dependent on hardware support. For instance, some processors (such as the Sun UltraSPARC) provide multiple sets of registers. A context switch here simply requires changing the pointer to the current register set."

dmatlack

The state of the running process must be saved and the state of the next process must be restored:

  1. Storing all register values of the currently running process somewhere in kernel memory and then populating registers with the state of the next process. Included in the "state" of a process is all general purpose registers, the program counter, eflags register, data segment registers, etc.
  2. The page directory and tables must be swapped (which requires wiping the TLB).

Preceding this save-and-restore of state there is also work to be done in order to select the next runnable thread (scheduling) and some synchronization.