HDF5

HDF5 is a general-purpose library and file format for storing scientific data. HDF5 can store two primary types of objects: datasets and groups. A dataset is essentially a multidimensional array of data elements, and a group is a structure for organizing objects in an HDF5 file. Using these two basic objects, one can create and store almost any kind of scientific data structure, such as images, arrays of vectors, and structured and unstructured grids.

Availability and Restrictions

Versions

HDF5 is available on the Owens, Pitzer, and Ascend clusters. The versions currently available at OSC are:

Version Owens Pitzer Ascend
1.8.17  X    
1.8.19 X    
1.10.2 X X  
1.10.4 X X  
1.10.8     X
1.12.0 X* X*  
1.12.2 X X X
* Current Default Version

You can use module spider hdf5 to view available modules for a given machine. Feel free to contact OSC Help if you need other versions for your work.

Access

HDF5 is available to all OSC users. If you have any questions, please contact OSC Help.

Publisher/Vendor/Repository and License Type

The HDF Group, Open source (academic)

API Compatibility issue on hdf5/1.12

hdf5/1.12 may not be compatible with applications created with earlier hdf5 versions. To work around this, users may use a compatibility macro mapping:

  • To compile an application built with a version of HDF5 that includes deprecated symbols (the default), specify: -DH5_USE_110_API (autotools) or -DH5_USE_110_API:BOOL=ON (CMake)

However, users will not be able to take advantage of some of the new features in 1.12 when using these compatibility mappings. For more detail, please see the release notes.
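
As a rough sketch of how a build script might apply this mapping only when it is needed, the helper below prints the autotools-style flag for a 1.12.x version string and nothing for any other version. The function name hdf5_compat_flag and the idea of passing the version by hand are assumptions made for illustration; neither is provided by the OSC modules or by HDF5.

```shell
#!/bin/bash
# Hypothetical helper (not provided by OSC or HDF5): print the 1.10
# compatibility macro when the given HDF5 version is in the 1.12 series.
hdf5_compat_flag() {
    case "$1" in
        1.12|1.12.*) echo "-DH5_USE_110_API" ;;
        *)           ;;   # other versions: no extra flag
    esac
}

# Dry run: print (do not execute) the compile command for hdf5/1.12.2.
echo "icc -c $(hdf5_compat_flag 1.12.2) \$HDF5_C_INCLUDE myprog.c"
```

The last line only echoes the command, so it is safe to run on a login node. In a real build the flag would be added to the compile step in place of the echo.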

Usage

Usage on Owens

Set-up

Initializing the system for use of the HDF5 library depends on the system and the compiler you are using. To load the default HDF5 library, run the following command: module load hdf5. To load a particular version, use module load hdf5/version. For example, use module load hdf5/1.8.17 to load HDF5 version 1.8.17. You can use module spider hdf5 to view available modules.

Building With HDF5

The HDF5 library provides the following variables for use at build time:

Variable Use
$HDF5_C_INCLUDE Use during your compilation step for C programs
$HDF5_CPP_INCLUDE Use during your compilation step for C++ programs (serial version only)
$HDF5_F90_INCLUDE Use during your compilation step for FORTRAN programs
$HDF5_C_LIBS Use during your linking step for C programs
$HDF5_F90_LIBS Use during your linking step for FORTRAN programs

For example, to build the code myprog.c or myprog.f90 with the hdf5 library you would use:

icc -c $HDF5_C_INCLUDE myprog.c
icc -o myprog myprog.o $HDF5_C_LIBS
ifort -c $HDF5_F90_INCLUDE myprog.f90
ifort -o myprog myprog.o $HDF5_F90_LIBS
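
The build lines above rely on variables that are only set once the hdf5 module is loaded. If the module is missing, the variables expand to nothing and the compiler typically reports that it cannot find hdf5.h. The following is a minimal sketch of a guard that checks for the variables first. The function name require_vars is hypothetical and not something the module supplies.

```shell
#!/bin/bash
# Hypothetical guard: succeed only if every named variable is set and non-empty.
require_vars() {
    missing=0
    for name in "$@"; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: \$$name (is the hdf5 module loaded?)" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Compile and link the C example only when the module's variables are present.
if require_vars HDF5_C_INCLUDE HDF5_C_LIBS; then
    icc -c $HDF5_C_INCLUDE myprog.c && icc -o myprog myprog.o $HDF5_C_LIBS
else
    echo "HDF5 build variables not set; run: module load hdf5"
fi
```

The same check could be used for the Fortran build by passing HDF5_F90_INCLUDE and HDF5_F90_LIBS instead.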

Batch Usage

When you log into owens.osc.edu you are actually logged into a Linux box referred to as the login node. To gain access to the multiple processors in the computing environment, you must submit your job to the batch system for execution. Batch jobs can request multiple nodes/cores and compute time up to the limits of the OSC systems. Refer to Queues and Reservations and Batch Limit Rules for more info.

Non-interactive Batch Job (Serial Run)
A batch script can be created and submitted for a serial or parallel run. You can create the batch script using any text editor in a working directory on the system of your choice. Below is an example batch script that executes a program built with the HDF5 library:
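#!/bin/bash
#SBATCH --job-name=AppNameJob
#SBATCH --nodes=1 --ntasks-per-node=28
#SBATCH --account <project-account>

# Load the default HDF5 module
module load hdf5
# Copy the input file to the job's temporary directory and work there
cp foo.dat $TMPDIR
cd $TMPDIR
# appname stands in for your own HDF5-linked executable
appname
# Copy the output back to the directory the job was submitted from
cp foo_out.h5 $SLURM_SUBMIT_DIR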

Usage on Pitzer

Set-up

Initializing the system for use of the HDF5 library depends on the system and the compiler you are using. To load the default HDF5 library, run the following command: module load hdf5

Building With HDF5

The HDF5 library provides the following variables for use at build time:

VARIABLE USE
$HDF5_C_INCLUDE Use during your compilation step for C programs
$HDF5_CPP_INCLUDE Use during your compilation step for C++ programs (serial version only)
$HDF5_F90_INCLUDE Use during your compilation step for FORTRAN programs
$HDF5_C_LIBS Use during your linking step for C programs
$HDF5_F90_LIBS Use during your linking step for FORTRAN programs

For example, to build the code myprog.c or myprog.f90 with the hdf5 library you would use:

icc -c $HDF5_C_INCLUDE myprog.c
icc -o myprog myprog.o $HDF5_C_LIBS
ifort -c $HDF5_F90_INCLUDE myprog.f90
ifort -o myprog myprog.o $HDF5_F90_LIBS

Batch Usage

When you log into pitzer.osc.edu you are actually logged into a Linux box referred to as the login node. To gain access to the multiple processors in the computing environment, you must submit your job to the batch system for execution. Batch jobs can request multiple nodes/cores and compute time up to the limits of the OSC systems. Refer to Queues and Reservations and Batch Limit Rules for more info.

Non-interactive Batch Job (Serial Run)
A batch script can be created and submitted for a serial or parallel run. You can create the batch script using any text editor in a working directory on the system of your choice. Below is an example batch script that executes a program built with the HDF5 library:
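#!/bin/bash
#SBATCH --job-name=AppNameJob
#SBATCH --nodes=1 --ntasks-per-node=48
#SBATCH --account <project-account>

# Load the default HDF5 module
module load hdf5
# Copy the input file to the job's temporary directory and work there
cp foo.dat $TMPDIR
cd $TMPDIR
# appname stands in for your own HDF5-linked executable
appname
# Copy the output back to the directory the job was submitted from
cp foo_out.h5 $SLURM_SUBMIT_DIR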
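
The final cp in the batch script above assumes the program actually produced foo_out.h5. As a hedged sketch, a job could instead copy the result back only when the file exists and is non-empty, so that a failed run is easier to spot in the job output. The helper name stage_out is an assumption for illustration and is not part of Slurm or HDF5.

```shell
#!/bin/bash
# Hypothetical helper (not part of Slurm or HDF5): copy a result file to a
# destination directory only if the file exists and is non-empty.
stage_out() {
    src=$1
    dest=$2
    if [ -s "$src" ]; then
        cp "$src" "$dest"
    else
        echo "warning: $src is missing or empty; nothing copied" >&2
        return 1
    fi
}

# Inside a batch job this could replace the final cp line:
#   stage_out foo_out.h5 "$SLURM_SUBMIT_DIR"
```

The function returns a non-zero status when nothing is copied, so the job script can decide whether to stop or continue.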
