GridFTP driver for ROMIO

I've developed an ADIO driver for ROMIO that makes it possible to access GridFTP URLs (e.g. gsiftp://oscbw.osc.edu/home/troy/foo) through the MPI-2 I/O API. This makes MPI parallel I/O in distributed scenarios (e.g. parallel programs running at multiple sites under MPICH-G2) much more straightforward than it would otherwise be. The usual Globus-based PKI authentication layers beneath GridFTP are used automatically.

Publications

I've written a paper on this work which has been accepted to Cluster 2004. The slides of my presentation are also available.

Source code patch

This patch (relative to the ROMIO in MPICH2 0.96p2) or this one (relative to ROMIO 1.2.6) adds GridFTP functionality to ROMIO. It should apply cleanly to any recent MPICH2 source tree, as well as MPICH source trees after 1.2.6.

Limitations

  1. Writes don't work through NAT firewalls unless inbound ports are statically mapped to specific nodes. This is a limitation of the GridFTP protocol, not the ADIO driver.
  2. Shared file pointers don't work. GridFTP is not alone in this regard; about half of the ADIO drivers in ROMIO (including PVFS and PVFS2) don't support shared file pointers.
  3. Atomic mode (e.g. MPI_File_set_atomicity()) and sync operations (e.g. MPI_File_sync()) don't work. The underlying GridFTP operations are about as atomic and synchronous as they are likely to get anyway. However, the lack of atomic mode breaks parallel HDF5.
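Regarding the shared file pointer limitation: a common way to avoid shared file pointers in MPI-IO generally is to have each rank compute its own offset and use the explicit-offset routines (MPI_File_read_at(), MPI_File_write_at()). The sketch below shows only the offset arithmetic, assuming fixed-size blocks interleaved round-robin across ranks. The function block_offset() and this layout are illustrative assumptions, not something the driver provides or requires.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: byte offset at which `rank` places its `round`-th
   block when `nprocs` ranks interleave blocks of `block_bytes` bytes
   round-robin (rank 0, rank 1, ..., rank nprocs-1, rank 0, ...). */
int64_t block_offset(int rank, int nprocs, int64_t round, int64_t block_bytes)
{
    assert(nprocs > 0 && rank >= 0 && rank < nprocs);
    assert(round >= 0 && block_bytes > 0);
    return (round * nprocs + rank) * block_bytes;
}
```

The result would be cast to MPI_Offset and passed as the offset argument of MPI_File_write_at() or MPI_File_read_at(), so that no rank depends on a file pointer shared with the others.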