High-performance Storage Undergraduate Research Opportunity
===========================================================

Researchers at the Ohio Supercomputer Center, an independent
organization in the Office of Research, are seeking up to two
undergraduate students to investigate the use of object-based devices
in parallel file systems. The students will be supported under an NSF
grant (#0621484). Work will be performed on west campus, at 1224
Kinnear Road, conveniently on the North Express and North and South
Campus Loop bus lines. Working at OSC provides access to very large
computational and storage facilities. Also participating in the
project are three OSC staff members and one graduate student.

An ideal student would have an interest in systems, storage, protocols
and networking; extensive experience in C programming; familiarity
with Linux, file systems, and parallel computing; but most
importantly, a drive to be intellectually challenged, to learn about
cutting-edge storage environments and to carry out novel research in
the field.

Beyond performing the research itself, staff and students will be
encouraged to present their results in the form of conference
presentations and journal papers.

Minimum job requirements:

- Enrolled as a full-time student at OSU
- Have completed at least one full year of classes
- Declared major in science or engineering field
- Willing to work 10-20 hours per week
- Knowledge of basic Unix shell commands: cd, ls, etc.
- Competent in C language programming

Potential projects include:

- Adapt the SQLite library to simplify locking for single-application
  environments; evaluate performance of the modified library in an OSD
  target.
- Prepare three open source code packages for distribution; maintain
  mailing lists and web pages; serve as first line of support for
  external queries.
- Extend OSD protocols with new commands for scatter/gather data
  transfer and atomic operation support.

and others, depending on your interest.

Technical information on OSD is available at
http://www.t10.org/drafts.htm#osd2 . Information on PVFS, the parallel
file system used in this work, is available at http://www.pvfs.org/ .
See http://www.osc.edu/~pw/papers/ for some abstracts and papers of
related work.

For more information, please contact Pete Wyckoff by email at
pw@osc.edu or telephone 614 247 7956.

To visit OSC, see:

    http://www.osc.edu/about/visitOSC.shtml
    http://www.osu.edu/map/building.php?area=ucomm&building=374

and CABS maps for North Express, North and South Campus Loops, to
arrive at the "Southwest Lot" stop.

Project Abstract
----------------

While continued improvements in processing speeds and disk densities
improve computing over time, the most fundamental advances come from
changing the ways in which components interact. Delegating
responsibility for some operations from the host processor to
intelligent peripherals can improve application performance.
Traditional storage technology is based on simple fixed-size accesses
with little assistance from disk drives, but an emerging standard for
object-based storage devices (OSDs) is being adopted. These devices
will offer improvements in performance, scalability and management,
and are expected to be available as commodity items soon.

When assembled as a parallel file system for use in high-performance
computing, object-based storage devices offer the potential to improve
scalability and throughput by permitting clients to securely and
directly access storage. However, while the feature set offered by OSD
is richer than that of traditional block-based devices, it does not
provide all the functionality needed by a parallel file system. We
will examine multiple aspects of the mismatch between the needs of a
parallel file system, in particular PVFS2, and the capabilities of
OSD. Topic areas include mapping data to objects, metadata, transport,
caching and reliability.

Trade-offs arise from the mapping of files to objects, and from how
files are striped across multiple objects and disks to obtain good
performance. A distributed file system needs to track metadata that
describes and connects data. OSDs offer automatic management of some
critical metadata components that can be used by the file system.
There are transport issues related to flow control and multicast
operations that must be solved. Implementing client caching schemes
and maintaining data consistency also require proper application of
OSD capabilities.

Our work will examine the feasibility of OSDs for use in parallel file
systems, discovering techniques to accommodate this high-performance
usage model. We will also suggest extensions to the current OSD
standard as needed.

About OSC
---------

OSC is Ohio's high performance computing, networking, and research
center. Established in 1987 by the Ohio Board of Regents, the Center
provides scientific computing, networking, educational outreach, and
information technology resources to state and national high
performance computing and networking groups. OSC empowers its
academic, industrial, and government partners to make Ohio the
education and technology state of the future. More information about
OSC can be found at www.osc.edu.