README for MadMPI benchmark

This document describes MadMPI benchmark installation and configuration.

For any question, mailto:

For more information, see:

Quick Start

A quick cheat sheet for the impatient:

mpiexec -n 2 -host host1,host2 ./mpi_bench_overlap | tee out.dat

The run takes from 10 minutes to 2 hours, depending on network speed. Then build the performance report using:

./mpi_bench_extract out.dat

It outputs data in out.dat.d/. The data may be transferred to another host and the performance report extracted there with another installation of MadMPI benchmark, so that gnuplot does not have to be installed on the computing nodes.
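The two quick-start steps can also be driven from a script. The following is a minimal Python sketch, not part of MadMPI benchmark: the helper names (overlap_command, extract_command, run_and_capture) are hypothetical, and the mpiexec options are simply copied from the command above, so they may need adjusting for your MPI launcher.

```python
import subprocess
import sys


def overlap_command(hosts, binary="./mpi_bench_overlap"):
    """Return the quick-start mpiexec command line as an argv list."""
    if len(hosts) != 2:
        raise ValueError("mpi_bench_overlap is run on 2 nodes")
    # Options copied from the quick start; adjust them for your launcher.
    return ["mpiexec", "-n", "2", "-host", ",".join(hosts), binary]


def extract_command(out_file, binary="./mpi_bench_extract"):
    """Return the command line that builds the report from out_file."""
    return [binary, out_file]


def run_and_capture(cmd, out_file):
    """Run cmd, copying its stdout to the terminal and to out_file (like tee)."""
    with open(out_file, "w") as out:
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
        for line in proc.stdout:
            sys.stdout.write(line)
            out.write(line)
        return proc.wait()
```

For example, run_and_capture(overlap_command(["host1", "host2"]), "out.dat") followed by subprocess.run(extract_command("out.dat"), check=True) would reproduce the two commands above.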

Please send the out.dat file to Alexandre.Denis@inria.fr to have it integrated on the MadMPI benchmark web site.


Requirements

  • MPI library
  • autoconf (v 2.50 or later, for svn users)
  • hwloc (optional)
  • gnuplot (optional, v5.0 or later)
  • doxygen (optional, for doc generation)


Installation

MadMPI benchmark follows the usual autoconf procedure:

./configure [your options here]
make install

The make install step is optional. The benchmark may be run from its build directory.


Documentation

  • Benchmarks may be run separately (single benchmark per binary), or as a binary running a full series.
  • For overlap benchmarks, run mpi_bench_overlap on 2 nodes, capture its standard output in a file, and pass this file to mpi_bench_extract. The processed data is written to a ${file}.d/ directory containing:
    • raw series for each packet size (files ${bench}-series/${bench}-s${size}.dat)
    • 2D data formatted to feed gnuplot pm3d graphs, joined with reference non-overlapped values (files ${bench}-ref.dat)
    • gnuplot scripts (files ${bench}.gp)
    • individual graphs for each benchmark (files ${bench}.png)
    • synthetic graphs (all.png)
  • The benchmarks are:
    • mpi_bench_sendrecv: send/receive pingpong, used as a reference
    • mpi_bench_noncontig: send/receive pingpong with non-contiguous datatype, used as a reference
    • mpi_bench_send_overhead: processor time consumed on sender side to send data (the overhead in the LogP model). Useful for explaining the overlap benchmarks.
    • mpi_bench_overlap_sender: overlap on sender side (i.e. MPI_Isend, computation, MPI_Wait), total time
    • mpi_bench_overlap_recv: overlap on receiver side (i.e. MPI_Irecv, computation, MPI_Wait), total time
    • mpi_bench_overlap_bidir: overlap on both sides
    • mpi_bench_overlap_sender_noncontig: overlap on sender side, with non-contiguous datatype
    • mpi_bench_overlap_send_overhead: overlap on sender side (i.e. MPI_Isend, computation, MPI_Wait), time measured on sender side only
    • mpi_bench_overlap_Nload: overlap on sender side, with multi-threaded computation load
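
The output layout described above can be expanded into concrete paths, for instance to check that an extraction produced the expected files. This is a minimal Python sketch under stated assumptions, not a tool shipped with MadMPI benchmark: the function names are hypothetical, all files are assumed to sit directly under ${file}.d/, ${bench} is assumed to be the benchmark name as listed above, and ${size} is assumed to be a plain decimal number.

```python
from pathlib import Path


def report_layout(out_file, bench, sizes):
    """Expand the documented file name patterns for one benchmark."""
    # Assumption: everything lives directly under "<out_file>.d/".
    root = Path(f"{out_file}.d")
    return {
        # Assumption: ${size} is written as a plain decimal number.
        "series": [root / f"{bench}-series" / f"{bench}-s{size}.dat"
                   for size in sizes],
        "ref": root / f"{bench}-ref.dat",
        "gnuplot": root / f"{bench}.gp",
        "graph": root / f"{bench}.png",
        "all": root / "all.png",
    }


def missing_files(layout):
    """Return the expected paths that do not exist on disk."""
    paths = layout["series"] + [layout[key]
                                for key in ("ref", "gnuplot", "graph", "all")]
    return [path for path in paths if not path.exists()]
```

For example, missing_files(report_layout("out.dat", "mpi_bench_overlap_sender", [1024])) lists which of the expected files for that benchmark are absent; the packet size 1024 is only an illustrative value.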