How to use Open MPI
By alex

Open MPI is a little bit tricky to use. Hopefully, this article will offer some tips on how to use Open MPI on the GHC machines.

Getting set up

To run Open MPI programs on Gates machines, you need to set two environment variables:

  • LD_LIBRARY_PATH must include /usr/lib64/openmpi/lib
  • PATH must include /usr/lib64/openmpi/bin, and /usr/lib64/openmpi/bin must appear before /usr/local/bin in your PATH.
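
For example, if you use bash, you could add something like the following to your ~/.bashrc (these are the paths listed above; adjust if your shell or setup differs):

```shell
# Make the Open MPI libraries visible to the dynamic linker.
export LD_LIBRARY_PATH=/usr/lib64/openmpi/lib:$LD_LIBRARY_PATH
# Prepend the Open MPI binaries so they take precedence over /usr/local/bin.
export PATH=/usr/lib64/openmpi/bin:$PATH
```

Prepending to PATH guarantees the ordering requirement above, since anything in /usr/local/bin now comes later in the search order.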

Compiling

We're going to play around with this simple Open MPI program:

#include <mpi.h>
#include <stdio.h>
int main(int argc, char** argv) {
  // Initialize the MPI environment
  MPI_Init(NULL, NULL);

  // Get the number of processes.
  int world_size;
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);

  // Get the rank of the process.
  int world_rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

  // Get the name of the processor.
  char processor_name[MPI_MAX_PROCESSOR_NAME];
  int name_len;
  MPI_Get_processor_name(processor_name, &name_len);

  // Print off a hello world message.
  printf("Hello world from processor %s, rank %d"
         " out of %d processors\n",
         processor_name, world_rank, world_size);

  // Finalize the MPI environment.
  MPI_Finalize();
}

We compile this program with¹:

areece@ghc68$ mpic++ mpi.c -o hello_mpi

And now we're ready to run!

Running on one machine

To run on one machine, we use mpirun. We can control the number of processes launched using the parameter -np as follows:

areece@ghc68$ mpirun -np 6 hello_mpi
Hello world from processor ghc68.ghc.andrew.cmu.edu, rank 1 out of 6 processors
Hello world from processor ghc68.ghc.andrew.cmu.edu, rank 2 out of 6 processors
Hello world from processor ghc68.ghc.andrew.cmu.edu, rank 4 out of 6 processors
Hello world from processor ghc68.ghc.andrew.cmu.edu, rank 5 out of 6 processors
Hello world from processor ghc68.ghc.andrew.cmu.edu, rank 0 out of 6 processors
Hello world from processor ghc68.ghc.andrew.cmu.edu, rank 3 out of 6 processors

You'll note that the output is not in a deterministic order: this is expected because there is no synchronization to enforce a consistent ordering.
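
If you do want deterministic output, one common pattern (a sketch, not the only way) is to funnel all printing through rank 0: every other rank sends its message there, and rank 0 prints the messages in rank order.

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
  MPI_Init(&argc, &argv);

  int world_size, world_rank;
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

  char msg[128];
  snprintf(msg, sizeof(msg), "Hello from rank %d", world_rank);

  if (world_rank == 0) {
    // Rank 0 prints its own message first, then receives and prints
    // the others in increasing rank order, fixing the output order.
    printf("%s\n", msg);
    for (int src = 1; src < world_size; src++) {
      MPI_Recv(msg, sizeof(msg), MPI_CHAR, src, 0,
               MPI_COMM_WORLD, MPI_STATUS_IGNORE);
      printf("%s\n", msg);
    }
  } else {
    // Every non-zero rank sends its message to rank 0.
    MPI_Send(msg, sizeof(msg), MPI_CHAR, 0, 0, MPI_COMM_WORLD);
  }

  MPI_Finalize();
  return 0;
}
```

This serializes the printing, so don't do it on a hot path; it's mainly useful for debugging output.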

Running on multiple machines

If we want to test how our program scales, we can use mpirun to run it on multiple machines:

areece@ghc68$ mpirun -np 6 --host ghc18,ghc19,ghc20 hello_mpi 
Hello world from processor ghc19.ghc.andrew.cmu.edu, rank 1 out of 6 processors
Hello world from processor ghc19.ghc.andrew.cmu.edu, rank 4 out of 6 processors
Hello world from processor ghc18.ghc.andrew.cmu.edu, rank 0 out of 6 processors
Hello world from processor ghc18.ghc.andrew.cmu.edu, rank 3 out of 6 processors
Hello world from processor ghc20.ghc.andrew.cmu.edu, rank 2 out of 6 processors
Hello world from processor ghc20.ghc.andrew.cmu.edu, rank 5 out of 6 processors

You should note a few things here.

  • The program runs on each of the machines specified (and not necessarily on the machine you launched it from).
  • mpirun launches 6 processes total across all machines, not 6 processes per machine.
  • By default, mpirun assigns processes to machines in a round-robin fashion. It's possible to control this placement if you're interested.
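
For example, you can ask for a fixed number of processes per node instead of round-robin placement (this flag is from the Open MPI 1.x series; flag names vary between versions, so check mpirun --help on your machine):

```shell
# Launch exactly 2 processes on each listed host (6 total),
# instead of the default round-robin assignment.
mpirun -npernode 2 --host ghc18,ghc19,ghc20 hello_mpi
```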

If you are annoyed with specifying the hosts on the command line every time, you can put them in a file instead. (Below we also pass -bind-to-core, which pins each process to a core; the flag comes from the Open MPI 1.x series, so check mpirun --help on your version.)

areece@ghc68$ cat hostfile 
ghc18
ghc19
ghc20
areece@ghc68$ mpirun -np 6 --hostfile hostfile -bind-to-core hello_mpi
Hello world from processor ghc19.ghc.andrew.cmu.edu, rank 1 out of 6 processors
Hello world from processor ghc18.ghc.andrew.cmu.edu, rank 0 out of 6 processors
Hello world from processor ghc19.ghc.andrew.cmu.edu, rank 4 out of 6 processors
Hello world from processor ghc18.ghc.andrew.cmu.edu, rank 3 out of 6 processors
Hello world from processor ghc20.ghc.andrew.cmu.edu, rank 2 out of 6 processors
Hello world from processor ghc20.ghc.andrew.cmu.edu, rank 5 out of 6 processors

Thoughts about using MPI

Some final thoughts:

  • You should be mindful of the fact that there are other people using these machines. You should probably not attempt to use every core on a machine at once: not only is it impolite to use up too much CPU, you will also probably collect misleading timing data if other people are using the machine. For this reason, we suggest launching no more than half as many processes as there are cores.
  • You really, really, really should read up on mpirun on your own (man mpirun is a good start).
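
To follow the first piece of advice, you need to know how many cores a machine has; on Linux, nproc reports the count:

```shell
# Print the number of cores available on this machine.
nproc
# Half of that is a reasonable upper bound for -np on a shared machine.
echo $(( $(nproc) / 2 ))
```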

  1. Yes, I know mpicc might be more idiomatic for C programs, but I'm ignoring that for now.