Overview

DIRSIG5 supports the automated processing of single/multi-frame simulations across multiple processes/nodes using the OpenMPI implementation of the Message Passing Interface (MPI). Separate builds for MPI and non-MPI environments are available from myDIRSIG. Non-MPI builds use conventional file I/O to write output files, while the MPI builds leverage MPI-IO. It is recommended that end-users running on a single workstation with a single CPU socket use non-MPI builds to reduce the number of runtime dependencies. Workstations featuring more than one CPU socket can gain a significant runtime performance benefit by using the MPI build with a proper Process Topology.

Dependencies

OpenMPI runtime libraries are not distributed with DIRSIG5. The MPI build was compiled against OpenMPI v1.10.7, the default version available in Red Hat Enterprise Linux (RHEL) 7. OpenMPI versions in the 1.10.x range and newer are likely to work as well, but have not been tested.
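
To confirm which OpenMPI version is available, one can query the package manager or ask mpirun directly (a quick sketch for a RHEL-like system; the package name may differ elsewhere, and mpirun is typically only on the PATH after the environment module is loaded, as in the Quick Start below):

$ rpm -q openmpi
$ mpirun --version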

Process Topology

Due to NUMA considerations, it is recommended that one run one bound process per CPU socket. Each process should then spawn a number of threads equal to the number of CPU cores available on that socket. In general, it is recommended that simultaneous multi-threading (SMT, e.g. Hyper-Threading) be disabled to better control NUMA characteristics, but the performance impact varies between system configurations and disabling it may not be optimal in all situations.
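
To determine the socket and core layout before choosing a topology, the standard Linux lscpu utility can be used; the Socket(s), Core(s) per socket, and Thread(s) per core fields report the values needed (a thread count per core greater than 1 indicates SMT is enabled):

$ lscpu | grep -E 'Socket|Core|Thread'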

MPI Modes

DIRSIG supports two scheduling modes:

All Processes per-Frame (APF)

This is the default behavior. APF distributes the processing of each frame (capture) in a simulation across all processes. It is used to maximize complete frame throughput.

Unique Frame per-Process (UFP)

UFP assigns one unique frame (capture) to each process. Under certain circumstances it can achieve a lower overall simulation runtime than APF for multi-frame simulations (e.g. poor filesystem/MPI-IO implementation support for writing simultaneously to the same file, and/or if the scene changes significantly between frames). It is only available for multi-frame simulations with a platform truth/image schedule of capture (i.e. one image and/or truth file per frame).

Note Currently, DIRSIG doesn’t transition from UFP to APF scheduling mode for remainder frames, so efficiency peaks when the number of frames is evenly divisible by the number of processes. For example, 16 frames across 4 processes complete in 4 full waves, whereas 18 frames leave 2 processes idle during the final partial wave.

Quick Start

A user can easily load the OpenMPI utilities/libraries into their shell environment using environment modules. The OpenMPI environment module is installed with the openmpi package in RHEL-like Linux distributions when deployed via yum.
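
If the exact module name is unknown, the available MPI modules can be listed first (assuming the environment-modules package is installed):

$ module avail mpi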

Load the OpenMPI environment

Load the environment module for OpenMPI into the user’s shell environment:

$ module load mpi/openmpi-x86_64
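
To verify that the module loaded correctly, check that mpirun resolves and reports the expected version:

$ which mpirun
$ mpirun --version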

Run DIRSIG with 1 process per socket (default APF schedule)

Use the --npersocket option to mpirun to have OpenMPI run a single process per socket:

$ mpirun --npersocket 1 dirsig5 MY_SIM.sim
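
The same per-socket topology can be extended across multiple nodes by giving OpenMPI a host list; a minimal sketch, assuming a hostfile named my_hosts (an illustrative name) that lists the participating nodes:

$ mpirun --hostfile my_hosts --npersocket 1 dirsig5 MY_SIM.sim

This assumes the dirsig5 executable and the simulation inputs are accessible at the same paths on every node (e.g. via a shared filesystem).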

Same as above, but with UFP schedule

Use the DIRSIG5 --mpi_one_event_per_node option to switch the internal scheduler into the unique frame per-process (UFP) mode:

$ mpirun --npersocket 1 dirsig5 --mpi_one_event_per_node MY_SIM.sim