Darshan

Darshan is a lightweight "scalable HPC I/O characterization tool". It profiles the I/O of MPI-based programs by writing log files to a consistent log location (which also makes them available to systems administrators), and it provides scripts that generate summary PDF reports characterizing an application's I/O behavior.

Availability and Restrictions

Versions

The following versions of Darshan are available on OSC clusters:

Version     Owens  Pitzer
3.1.2       X
3.1.4       X
3.1.5-pre1  X
3.1.5       X
3.1.6       X      X
3.1.8       X*     X*
3.2.1       X      X

* Current default version

You can use module spider darshan to view available modules for a given machine. Feel free to contact OSC Help if you need other versions for your work.

Access 

Darshan is available to all OSC users. If you have any questions, please contact OSC Help.

Publisher/Vendor/Repository and License Type

MCSD, Argonne National Laboratory, Open source

Usage

Usage on Owens & Pitzer

Setup

To configure your environment for Darshan on Owens or Pitzer, run module spider darshan/VERSION to find the supported compiler and MPI implementations, e.g.

$ module spider darshan/3.2.1

------------------------------------------------------------------------------------------------
  darshan: darshan/3.2.1
------------------------------------------------------------------------------------------------

    You will need to load all module(s) on any one of the lines below before the "darshan/3.2.1" module is available to load.

      intel/19.0.3  intelmpi/2019.7
      intel/19.0.3  mvapich2/2.3.1
      intel/19.0.3  mvapich2/2.3.2
      intel/19.0.3  mvapich2/2.3.3
      intel/19.0.3  mvapich2/2.3.4
      intel/19.0.3  mvapich2/2.3.5
      intel/19.0.5  intelmpi/2019.3
      intel/19.0.5  intelmpi/2019.7
      intel/19.0.5  mvapich2/2.3.1
      intel/19.0.5  mvapich2/2.3.2
      intel/19.0.5  mvapich2/2.3.3
      intel/19.0.5  mvapich2/2.3.4
      intel/19.0.5  mvapich2/2.3.5

Then load your preferred programming environment followed by the Darshan module:

$ module load intel/19.0.5 mvapich2/2.3.5
$ module load darshan/3.2.1
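
As a quick sanity check, you can confirm that the modules are loaded and that the Darshan runtime library is in place. This assumes the darshan module sets OSC_DARSHAN_DIR, the same variable used in the batch examples below:

$ module list
$ ls $OSC_DARSHAN_DIR/lib/libdarshan.so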

Batch Usage

Batch jobs can request multiple nodes/cores and compute time up to the limits of the OSC systems. Refer to Queues and Reservations (Owens, Pitzer) and Scheduling Policies and Limits for more information. 

If you have an MPI-based program, the syntax is as simple as:

module load darshan

# basic call to darshan
export MV2_USE_SHARED_MEM=0
export LD_PRELOAD=$OSC_DARSHAN_DIR/lib/libdarshan.so
srun [args] ./my_mpi_program

# to show evidence that Darshan is working and to see internal timing
export DARSHAN_INTERNAL_TIMING=yes
srun [args] ./my_mpi_program
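
By default the log is written to the system-wide Darshan log location; if you prefer to keep it with your job, you can set DARSHAN_LOGFILE before the run and build the PDF report yourself. The following is a minimal sketch that reuses the DARSHAN_LOGFILE variable and the darshan-job-summary.pl call from the full example below; the log file name is an arbitrary choice:

# keep the Darshan log in the working directory instead of the central log location
export DARSHAN_LOGFILE=${SLURM_JOB_ID}_my_mpi_program.log
srun [args] ./my_mpi_program

# generate the PDF summary from the log after the run completes
darshan-job-summary.pl --summary $DARSHAN_LOGFILE
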
An Example of Using Darshan with MPI-IO

Below is an example batch script (darshan_mpi_pfsdir_test.sh) for testing MPI-IO and POSIX I/O. Because the files generated here are large scratch files, there is no need to retain them.

#!/bin/bash
#SBATCH --job-name="darshan_mpi_pfsdir_test"
#SBATCH --ntasks=4
#SBATCH --ntasks-per-node=2
#SBATCH --output=rfm_darshan_mpi_pfsdir_test.out
#SBATCH --time=0:10:0
#SBATCH -p parallel
#SBATCH --gres=pfsdir:ess

# Setup Darshan
module load intel
module load mvapich2
module load darshan
export DARSHAN_LOGFILE=${LMOD_SYSTEM_NAME}_${SLURM_JOB_ID/.*/}_${SLURM_JOB_NAME}.log
export DARSHAN_INTERNAL_TIMING=yes
export MV2_USE_SHARED_MEM=0
export LD_PRELOAD=$OSC_DARSHAN_DIR/lib/libdarshan.so

# Prepare the scratch files and run the cases
cp ~support/share/reframe/source/darshan/io-sample.c .
mpicc -o io-sample io-sample.c -lm
for x in 0 1 2 3; do  dd if=/dev/zero of=$PFSDIR/read_only.$x bs=2097152000 count=1; done
shopt -s expand_aliases
srun ./io-sample -p $PFSDIR -b 524288000 -v

# Generate report
darshan-job-summary.pl --summary $DARSHAN_LOGFILE

In order to run it via the batch system, submit the darshan_mpi_pfsdir_test.sh file with the following command:

sbatch darshan_mpi_pfsdir_test.sh
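
Once the job completes, the submission directory should contain the job output file, the Darshan log named via DARSHAN_LOGFILE, and the summary report produced by darshan-job-summary.pl (the exact report file name depends on the tool's defaults). A sketch of how you might inspect the results; darshan-parser is the standard Darshan utility for dumping a log's counters as plain text:

# job output, including the Darshan internal timing messages
cat rfm_darshan_mpi_pfsdir_test.out

# everything the job produced: script, output, Darshan log, summary report
ls *darshan_mpi_pfsdir_test*

# optional: dump the raw counters from the Darshan log as text
darshan-parser *_darshan_mpi_pfsdir_test.log | less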

Further Reading

Official Darshan project page and documentation: https://www.mcs.anl.gov/research/projects/darshan/
