Overview of File Systems

OSC has several different file systems where you can create files and directories. The characteristics of those systems and the policies associated with them determine their suitability for any particular purpose. This section describes the characteristics and policies you should consider when selecting a file system to use.

The various file systems are described in subsequent sections.

Visibility

Most of our file systems are shared. Directories and files on the shared file systems are accessible from all OSC HPC systems. By contrast, local storage is visible only on the node it is located on. Each compute node has a local disk with scratch file space.

Permanence

Some of our storage environments are intended for long-term storage; files are never deleted by the system or OSC staff. Some are intended as scratch space, with files deleted as soon as the associated job exits. Others fall somewhere in between, with expected data lifetimes of a few months to a couple of years.

Backup policies

Some of the file systems are backed up to tape; some are considered temporary storage and are not backed up. Backup schedules differ for different systems.

In no case do we make an absolute guarantee about our ability to recover data. Please read the official OSC data management policies for details. That said, we have never lost backed-up data and have rarely had an accidental loss of non-backed-up data.

Size/Quota

The permanent (backed-up) and scratch file systems all have quotas limiting the amount of file space and the number of files that each user or group can use. Your usage and quota information is displayed every time you log in to one of our HPC systems. You can also check your home directory quota using the quota command. We encourage you to pay attention to these numbers because your file operations, and probably your compute jobs, will fail if you exceed them. If you have extremely large files, you will also need to pay attention to the amount of local file space available on different compute nodes.
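
The quota command and the login banner remain the authoritative sources for your usage, but if you need to find out which directories account for it, a short script can tally sizes and file counts for you. The following Python sketch is only an illustration; the starting path is an assumption, so substitute the directory you want to inspect:

    import os

    # Minimal sketch: tally bytes and file count under a directory so the
    # totals can be compared with the quota limits described above.
    # The starting path is an example; adjust it to the tree you care about.
    root = os.path.expanduser("~")

    total_bytes = 0
    total_files = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total_bytes += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # unreadable or vanished file; skip it
            total_files += 1

    print(f"{root}: {total_bytes / 1e9:.1f} GB in {total_files:,} files")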

Performance

File systems have different performance characteristics, including read/write speeds and behavior under heavy load. Performance matters a great deal for I/O-intensive jobs, and choosing the right file system can have a significant impact on the speed and efficiency of your computations. You should never do heavy I/O in your home or project directories, for example.
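
To keep heavy I/O off the shared file systems, the common pattern is to stage data into scratch space at the start of a job, work there, and copy results back at the end. The Python sketch below illustrates that pattern under two assumptions: it runs inside a batch job where the TMPDIR environment variable points at node-local scratch (see the summary in the next section), and the file names are hypothetical.

    import os
    import shutil
    import tempfile

    # Minimal staging sketch, assuming $TMPDIR is set inside a batch job.
    scratch = os.environ.get("TMPDIR", tempfile.gettempdir())
    local_input = os.path.join(scratch, "input.dat")
    local_output = os.path.join(scratch, "result.txt")

    # 1. Copy input from the shared home directory to fast local scratch.
    shutil.copy(os.path.expanduser("~/input.dat"), local_input)

    # 2. Do the I/O-heavy work against the local copy.
    with open(local_input, "rb") as f:
        nbytes = len(f.read())          # stand-in for real processing
    with open(local_output, "w") as f:
        f.write(f"processed {nbytes} bytes\n")

    # 3. Copy only the results back to permanent storage before the job
    #    ends; node-local scratch is flushed when the job exits.
    shutil.copy(local_output, os.path.expanduser("~/result.txt"))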

Summary of file systems

Each file system is configured differently to serve a different purpose:

Home Directory
  Path: /users/project/userID
  Environment variable: $HOME or ~
  Space purpose: Permanent storage
  Backed up: Daily
  Flushed: No
  Visibility: Login and compute nodes
  Quota/allocation: 500 GB of storage and 1,000,000 files
  Total size: 1.9 PB
  Bandwidth: 40 GB/s
  Type: NetApp WAFL service

Project
  Path: /fs/ess
  Environment variable: N/A
  Space purpose: Long-term storage
  Backed up: Daily
  Flushed: No
  Visibility: Login and compute nodes
  Quota/allocation: Typically 1-5 TB of storage and 100,000 files per TB
  Total size: 13.5 PB (/fs/ess)
  Bandwidth: Reads 60 GB/s, writes 50 GB/s
  Type: GPFS

Local Disk
  Path: /tmp
  Environment variable: $TMPDIR
  Space purpose: Temporary
  Backed up: No
  Flushed: At the end of the job when $TMPDIR is used
  Visibility: Compute node only
  Quota/allocation: Varies by node
  Total size: Varies by system
  Bandwidth: Varies by system
  Type: Varies by system

Scratch (global)
  Path: /fs/scratch
  Environment variable: $PFSDIR
  Space purpose: Temporary
  Backed up: No
  Flushed: At the end of the job when $PFSDIR is used
  Visibility: Login and compute nodes
  Quota/allocation: 100 TB of storage and 25,000,000 files
  Total size: 3.5 PB (/fs/scratch)
  Bandwidth: Reads 170 GB/s, writes 70 GB/s
  Type: GPFS

Backup
  Path: N/A
  Environment variable: N/A
  Space purpose: Backup; replicated in Cleveland
  Backed up: Yes
  Flushed: No
  Visibility: N/A
  Quota/allocation: N/A
  Bandwidth: N/A
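
The environment variables in the summary are the portable way to refer to scratch space from inside a job. As an illustration of how they differ, the hypothetical Python helper below (not an OSC-provided function) prefers the globally visible scratch directory when several nodes need to see the same files, and falls back to node-local disk otherwise.

    import os
    import tempfile

    def pick_scratch_dir(shared_across_nodes: bool) -> str:
        """Illustrative helper, not an OSC-provided function.

        Per the summary above, $PFSDIR points at the global scratch file
        system visible from all nodes, while $TMPDIR points at node-local
        disk that only the current node can see.
        """
        if shared_across_nodes:
            path = os.environ.get("PFSDIR")
            if path:
                return path
        return os.environ.get("TMPDIR", tempfile.gettempdir())

    # A multi-node job in which every process must read the same files
    # would ask for the globally visible scratch space.
    print(pick_scratch_dir(shared_across_nodes=True))

The fallback to tempfile.gettempdir() only matters outside a batch job, where neither variable is set.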