Can someone elaborate on the second point, that message passing programs can execute on hardware that does not support system-wide loads and stores?


I'm not 100% sure, but I believe the second point is that each CPU only needs a chunk of its own memory to buffer messages. The hardware doesn't need to implement system-wide loads and stores to run message passing programs, because loads and stores only ever touch the local CPU's memory, and data moves between CPUs only through explicit sends and receives.

I don't think it's saying that the hardware can't support system-wide loads and stores, just that it doesn't have to.


MPI defines an interface for passing messages between different processors or machines, which lets programmers write parallel code that runs across a cluster. Unlike a shared memory program, an MPI program does not need the cluster to support system-wide loads or stores.
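
To make this concrete, here is a rough sketch of the idea in Python. It is not MPI: it uses two ordinary OS processes connected by pipes as a stand-in for two machines on a network, and the names `WORKER_SOURCE` and `remote_sum` are made up for illustration. The point it tries to show is that the worker has its own private address space, so it can never load the parent's list directly; the data has to arrive as a message, and the result has to be sent back as one.

```python
import subprocess
import sys

# Code run by the worker process (hypothetical example). The worker has its
# own address space, so the only way it can see the parent's data is by
# receiving it as a message on its standard input.
WORKER_SOURCE = """
import sys
numbers = [int(tok) for tok in sys.stdin.readline().split()]  # receive
print(sum(numbers), flush=True)                               # send reply
"""


def remote_sum(numbers):
    """Send `numbers` to a separate process and receive their sum back."""
    worker = subprocess.Popen(
        [sys.executable, "-c", WORKER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Serialize the data into a message; no memory is shared with the worker.
    message = " ".join(str(n) for n in numbers) + "\n"
    # Send the message, then block until the worker's reply arrives.
    reply, _ = worker.communicate(message)
    return int(reply)


if __name__ == "__main__":
    print(remote_sum([1, 2, 3, 4]))
```

Nothing here relies on one process reading or writing the other's memory, which is why the same pattern still works when the two ends are on different machines that share no memory at all.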