Systems Research

Collective communication and MPICH2 improvements for MVAPICH MPI on InfiniBand

Principal Investigators: D.K. Panda and P. Wyckoff
Funding Source: System Fabric Works
Duration: 6/16/03--9/30/03

Description: The emerging InfiniBand architecture offers a new way to design next-generation high-performance clusters. In addition to higher network speed, this architecture provides several new mechanisms (RDMA, multicast, atomic operations, and service levels) for building efficient, high-performance communication subsystems for clusters. At OSU, a high-performance MPI implementation (MVAPICH) focused on point-to-point communication has recently been designed and developed to take advantage of InfiniBand for next-generation clusters. Large-scale parallel systems also require efficient support for collective communication (broadcast/multicast, barrier, reduction, etc.). Modern systems can further benefit from the performance advantages offered by the emerging MPICH2 work from Argonne National Laboratory. In this work, we design scalable, high-performance collective communication schemes for InfiniBand-based clusters and transition the MPI over InfiniBand implementation to MPICH2.