Probe seems to consume the CPU

Yes; most MPI implementations busy-wait on blocking operations for the sake of performance. The assumption is that the MPI job is the only thing we care about on that processor, so if a task is blocked waiting for communication, the best thing to do is poll continually for it. That keeps latency low: there’s virtually no delay between when the message arrives and when it’s handed off to the MPI task. It also typically means the CPU is pegged at 100% even when nothing “real” is being done.

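That’s probably the best default behaviour for most MPI users, but it isn’t always what you want. MPI implementations typically let you change it. With Open MPI, you can set an MCA parameter that makes an idle process yield the processor between polls. The process still polls, so it can still show high CPU usage when nothing else wants the core, but other runnable work gets a chance to run: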

mpirun -np N --mca mpi_yield_when_idle 1 ./a.out
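
If yielding isn’t enough and you’d rather trade some latency for an idle CPU, one common workaround is to poll with a non-blocking check and sleep between polls. Here is a minimal sketch of that pattern in C. The names `ready_fn`, `wait_with_sleep` and `countdown_ready` are made up for illustration and are not part of any MPI library; in a real program the callback would wrap `MPI_Iprobe` and return its `flag` output.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback type: returns nonzero once the awaited event has
 * happened. In an MPI program this would wrap a non-blocking check such as
 * MPI_Iprobe and return the flag it sets. */
typedef int (*ready_fn)(void *ctx);

/* Poll `ready` until it reports success, sleeping `interval_ns` nanoseconds
 * between polls so the core is not spun at 100%.
 * Returns the number of polls that were made. */
static long wait_with_sleep(ready_fn ready, void *ctx, long interval_ns)
{
    assert(ready != NULL);
    assert(interval_ns >= 0 && interval_ns < 1000000000L);

    struct timespec pause = { .tv_sec = 0, .tv_nsec = interval_ns };
    long polls = 1;

    while (!ready(ctx)) {
        nanosleep(&pause, NULL);   /* give the CPU back while we wait */
        polls++;
    }
    return polls;
}

/* Stand-in for a message check, used only to exercise the loop:
 * reports "not ready" until the counter reaches zero. */
static int countdown_ready(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return 0;
    }
    return 1;
}
```

The cost is latency: a message that arrives just after a poll waits up to one sleep interval before it is noticed, so the interval is something to tune for your job rather than a fixed recommendation.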
