Systems Research

Robust High Performance Parallel Filesystem

Principal Investigator: P. Wyckoff
Funding Source: Sandia National Labs
Duration: 10/1/2002 - 9/30/2003

Description: First, we plan to do filesystem work in the context of PVFS, a freely available parallel filesystem developed at Clemson University. Why PVFS? Many parallel and distributed filesystems have been developed and discarded over the years, and multiple well-funded research groups are simultaneously working on completely new systems. There are even companies in business to sell production-quality data storage systems, some completed and some still in development. Most are incompatible with cluster computing because they insist on using expensive components such as Fibre Channel disks and switches, or large single servers. They also tend to be designed with serious hindrances to scalability, relying on a single machine to serve data or to manage file locks. And they insist on providing the NFS and CIFS protocols to gain market share, which constrains the implementation. PVFS begins with the assumption that the I/O nodes need be fundamentally no more expensive than cluster compute nodes: all the cost is in the disks, which can be standard IDE drives. PVFS also has the potential to scale well. Nor is it overdesigned for the task of storage for parallel applications: for instance, coherence is maintained by the application itself, not by the filesystem.