Slide 11 of 51
mak

What kind of operating system does a supercomputer run?

eosofsky

What kind of operating system events need to be eliminated? Are they eliminated or just delayed?

1_1

If one of these nodes takes extra time, the whole job slows down. In fact, when they run these machines, they turn off operating system features that could cause "noise". The goal is to ensure top speed at all times and avoid possible glitches.
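
To see why a single slow node matters so much, here is a minimal sketch of the arithmetic. It assumes (these are assumptions, not something stated on the slide) that all nodes synchronize at a barrier after every step, that each node is independently hit by OS noise with probability `p_noise` per step, and that a hit costs a fixed `delay`. The function names and the numbers are made up for illustration.

```python
def prob_step_delayed(p_noise: float, num_nodes: int) -> float:
    """Probability that at least one of num_nodes independent nodes
    is hit by noise during a single step."""
    return 1.0 - (1.0 - p_noise) ** num_nodes


def expected_step_time(base: float, delay: float,
                       p_noise: float, num_nodes: int) -> float:
    """Expected duration of one step when every node waits at a barrier,
    so the step lasts as long as the slowest node.
    Assumes a fixed delay per noise event (hypothetical model)."""
    return base + delay * prob_step_delayed(p_noise, num_nodes)


if __name__ == "__main__":
    # Hypothetical numbers: 1 ms of work per step, 0.5 ms noise delay,
    # 0.1% chance per node per step of being interrupted.
    for n in (1, 100, 10_000):
        p = prob_step_delayed(0.001, n)
        t = expected_step_time(1.0, 0.5, 0.001, n)
        print(f"nodes={n:6d}  P(step delayed)={p:.4f}  expected step={t:.4f} ms")
```

Under this model, an event that is rare on any one node becomes close to certain on every step once there are enough nodes, which is one way to read the "whole thing slows down" remark.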

jedi

@mak, from what I've read, supercomputers run lightweight Linux derivatives. The CNK (Compute Node Kernel) used by IBM Blue Gene (at Argonne National Laboratory) has:

  1. statically mapped physical memory
  2. no context switching or scheduling at the compute kernel
  3. no file I/O implementation on compute nodes

15712 talks about microkernels like Mach

googlebleh

@eosofsky There are many things going on in a traditional OS that aren't required in a supercomputer. If you know your machine is dedicated to performing one large compute job, you can actually remove kernel functionality that supports anything else.

For example, general-purpose OSs have a timer "tick" that goes off every, say, 10 ms, which the kernel uses to schedule different processes to run on the CPU. When running a long compute job for which the programmer has already partitioned the work appropriately, these periodic interruptions can cause significant slowdown.

Other examples include the networking stack (and its associated daemon processes) and desktop environments. Technically, HW interrupts shouldn't detract from performance, but since these machines don't have disks, any extra code in the kernel just wastes RAM that could be used for computation.
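
As a back-of-the-envelope sketch of the timer tick cost mentioned above: the 10 ms period comes from the comment, but the per-tick handling cost below is an assumed placeholder, not a measured value, and the helper names are hypothetical.

```python
def ticks_per_second(tick_period_s: float) -> float:
    """Number of timer interrupts per second for a given tick period."""
    return 1.0 / tick_period_s


def tick_overhead_fraction(tick_period_s: float, cost_per_tick_s: float) -> float:
    """Fraction of CPU time spent handling ticks on one core,
    assuming every tick costs the same fixed amount of time."""
    return cost_per_tick_s / tick_period_s


if __name__ == "__main__":
    period = 0.010      # 10 ms tick, as in the comment above
    cost = 0.00001      # ASSUMED 10 microseconds per tick (illustrative only)
    print(f"ticks per second: {ticks_per_second(period):.0f}")
    print(f"overhead on one core: {tick_overhead_fraction(period, cost):.4%}")
```

The direct cost on one core looks small under these assumed numbers. The bigger concern for a tightly synchronized job is that ticks land at different moments on different nodes, so some node is often running late when the others are ready to proceed.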

atadkase

Does each node have its own OS? Or is there like a single OS governing the entire cluster?