- Who can get an account?
- Where should a new OSC user begin?
- Do I have to pay for supercomputer use?
- How many supercomputers does OSC have? Which one should I use?
- How do I cite OSC in my publications?
- How do I submit my publications and funding information to OSC?
- Can I receive a letter of support from OSC when I apply for outside funding?
- How do I register for a workshop?
- Where can I find documentation?
- My question isn't answered here. Whom can I ask for help?
- Something seems to be wrong with the OSC systems. Should I contact the help desk?
- Where can I find logos for my presentations, posters, etc.?
- What are projects and accounts?
- How do I get/renew an account?
- I'm a faculty member. How do I get accounts for my students?
- I'm continuing the research of a student who graduated. Can I use his/her account?
- I'm working closely with another student. Can we share an account?
- How do I change my password?
- I want to use csh instead of bash. How do I change the default shell?
- What is an RU?
- How do I find my account balance?
- How do I get more resources?
- How much will my account be charged for supercomputer usage?
Disk Storage Questions
- What is my disk quota?
- How can I determine the total disk space used by my account?
- How do I get more disk space?
- How can I find my largest directories?
- Why do I receive "no space left" error when writing data to my home directory?
- How can I use tar and gzip to aggregate and compress files?
- Tar is taking too long. Is there a way to compress quicker?
- How do I change the email address OSC uses for me?
- I got an automated email from OSC. Where can I get more information about it?
- What is SSH?
- How does SSH work?
- Can I connect without using an SSH client?
- How can I upload or download files?
- Where can I find SSH and SFTP clients?
- How do I run a graphical application in an ssh session?
- Why do I get "connection refused" when trying to connect to a cluster?
Batch Processing Questions
- What is a batch job?
- How do I submit, check the status, and/or delete a batch job?
- Can I be notified by email when my batch job starts or ends?
- Why won't my job run?
- Why is my job being rejected with the error "Group: is not valid"?
- How can I retrieve files from unexpectedly terminated jobs?
- How can I delete all of my jobs on a cluster?
- How can I determine the number of cores in use by me or my group?
- How do I request GPU nodes for visualization?
Compiling System Questions
- What languages are available?
- What compiler (vendor) do you recommend?
- Will software built for one system run on another system?
Parallel Processing Questions
- What is parallel processing?
- What parallel processing environments are available?
- What is a core?
- I'm not seeing the performance I expected. How can I be sure my code is running in parallel?
- What software applications are available?
- Do you have a newer version of (name your favorite software)?
- How do I get authorized to use a particular software application?
- What math routines are available? Do you have ATLAS and LAPACK?
- Do you have NumPy/SciPy?
- OSC does not have a particular software package I would like to use. How can I request it?
- I have a software package that must be installed as root. What should I do?
- What are modules?
Performance Analysis Questions
- What are MFLOPS/GFLOPS/TFLOPS/PFLOPS?
- How do I find out about my code's performance?
- How can I optimize my code?
Other Common Problems
- What does "CPU time limit exceeded" mean?
- My program or file transfer died for no reason after 20 minutes. What happened?
- Why did my program die with a segmentation fault, address error, or signal 11?
- I created a batch script in a text editor on a Windows system, but when I submit it on an OSC system, almost every line in the script gives an error. Why is that?
- I copied my output file to a Windows system, but it doesn't display correctly. How can I fix it?
- What IP ranges do I need to allow in my firewall to use OSC services?
Any faculty member or research scientist at an academic institution in Ohio is eligible for an academic account at OSC. These researchers/educators may request accounts for their students and collaborators. Commercial accounts are also available. More information about applying for both academic and commercial accounts at OSC can be found at https://www.osc.edu/supercomputing/support/account.
The first thing you should do as a new OSC user is to check the email account that was given to OSC when you were registered. If you are a PI, this will be the email address you entered in the primary contact information on your Project Request application. When your account is created, you will receive an email giving you information on how to start using the services at OSC. After carefully reading the email, please first go to my.osc.edu to reset your password.
Once you are able to connect to our HPC systems, you should start familiarizing yourself with the software and services available from the OSC, including:
Because OSC receives funding from the state of Ohio, there is no charge for academic use up to 10,000 RUs annually. For information on the academic fee structure, click here.
OSC currently has three HPC clusters: Pitzer Cluster, a 10,560-core Dell cluster with Intel Xeon processors; Owens Cluster, a 23,500+ core Dell cluster with Intel Xeon processors; and Ruby Cluster, a 4,800-core Intel Xeon machine from HP. New users have access to the Pitzer and Owens clusters. Ruby is currently unavailable for general access. To learn more, click here.
Any publication of any material, whether copyrighted or not, based on or developed with OSC services, should cite the use of OSC, and the use of the specific services (where applicable). For more information about citing OSC, please visit www.osc.edu/citation.
Please submit a list of publications, noting which cite OSC, as well as a list of current funding with your application materials for additional resources. Alternatively, please email the lists to OSC Help. You can also add this information in my.osc.edu.
OSC has a standard letter of support that you can include (electronically or in hard copy) with a proposal for outside funding. This letter does not replace the application process for time on OSC's systems. To receive the letter of support, please send your request to firstname.lastname@example.org. You should provide the following information: name and address of the person/organization to whom the letter should be addressed; name(s) of the principal investigator(s) and the institution(s); title of the proposal; number of years of proposed project; number of RUs requested per year. Please allow at least two working days to process your request.
Hardware information about the systems is available at http://www.osc.edu/supercomputing/hardware.
Contact the OSC Help Desk. Support is available 24x7x365, but more complicated questions will need to wait for regular business hours (Monday - Friday, 9am - 5pm) to be addressed. More information on the OSC supercomputing help desk can be found on our Support Services page.
Information will be coming soon for guidelines on reporting possible system problems.
Please see our citation webpage.
An eligible principal investigator heads a project. Under a project, authorized users have accounts with credentials that permit users to gain access to the HPC systems. A principal investigator can have more than one project.
For information concerning accounts (i.e., how to apply, who can apply, etc.), see Accounts.
If an eligible principal investigator is new to OSC, he/she can apply for a start-up request and include the contact information for authorized users. Please refer to http://app.osc.edu/cgi-bin/app/startup for the online form. If an eligible principal investigator has a current project, he/she can add the user through the client portal, my.osc.edu. Authorized users do not have to be located in Ohio.
Please have your PI send an email to email@example.com for further discussions.
No. Each person using the OSC systems must have his/her own account. Sharing files is possible, even with separate accounts.
You can change your password through the My OSC portal. Log in at http://my.osc.edu, and click the "change password" button at the bottom left corner of the HPC User Profile box. Please note that your password has certain requirements; these are specified on the "change password" portal. Please wait 3-5 minutes before attempting to log in with the new password. For security purposes, please note that our password change policy requires a password change every 180 days.
You can change your default shell through the My OSC portal. Log in at my.osc.edu, and use the "Unix Shell" drop-down menu in the HPC User Profile box to change your shell. Please note that it will take a few minutes for the change to be applied. You will then need to log off the HPC system and log back on before the change goes into effect.
An RU is a resource unit. A resource unit is an aggregate measure of the use of CPU, memory, and file storage. For details on charging algorithms, see Charging.
To see usage and balance information from any system, refer to the OSCusage page. NOTE: Accounting is updated once a day, so the account balance is for the previous day.
To request more resources, the principal investigator must prepare a proposal that will be sent for peer review. After the review process is complete, the Allocations Committee of the Statewide Users Group meets to determine the number of resource units to be awarded. The committee meets on the second Thursday of even-numbered months. PIs should submit their proposals at least three weeks before the meeting date to ensure inclusion on the agenda.
The application form is located at http://app.osc.edu/cgi-bin/app/. A complete application comprises the following items:
- Contact information of new authorized users
- Contact information of at least three recommended reviewers
- Electronic version of PI’s resume
- Publications list (publications, presentations, and articles that cite OSC)
- Justification and performance (quantitative estimation of the number of RUs per run times; the number of runs for the estimated duration of the request; performance reports of the code to be used or explanations about performance; report of any optimization performed on the code)
- Proposal text
For details on charging algorithms, see Charging.
Disk Storage Questions
Each user has a quota of 500 gigabytes (GB) of storage and 1,000,000 files. You may also have access to a project directory with a separate quota. See Available File Systems for more information.
Your quota and disk usage are displayed every time you log in. You have limits on both the amount of space you use and the number of files you have. There are separate quotas for your home directory and any project directories you have access to. Note: The quota information displayed at login is updated twice a day, so the information may not reflect the current usage.
You may display your home directory quota information with
A PI may request project space to be shared by all users on a project. Estimate the amount of disk space that you will need and the duration that you will need it. Send requests to firstname.lastname@example.org.
See the Allocations and Accounts section for more information.
To reveal the directories in your account that are taking up the most disk space, you can use the du, sort, and tail commands. For example, to display the ten largest directories, change to your home directory and then run the command:
du . | sort -n | tail -n 10
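The same pipeline can be wrapped in a small function so that the directory and the number of results are parameters. This is only a sketch: the name largest_dirs is hypothetical, and the -d 1 option (limit the report to immediate subdirectories) is assumed to be supported by the installed du.

```shell
# largest_dirs is a hypothetical helper name, not an OSC-provided command.
# Usage: largest_dirs [directory] [count]
largest_dirs() {
    dir="${1:-.}"
    count="${2:-10}"
    # -k reports sizes in kilobytes; -d 1 lists only immediate subdirectories.
    # The last line of the output is the total for the directory itself.
    du -k -d 1 "$dir" 2>/dev/null | sort -n | tail -n "$count"
}
```

For example, largest_dirs "$HOME" 10 prints the largest entries directly under your home directory, smallest first, ending with the overall total.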
If you receive the error "No space left on device" when you try to write data to your home directory, it indicates the disk is full. First, check your home directory quota. Each user has a 500 GB storage quota, and the quota information is shown when you log in to our systems. If your disk quota is full, consider reducing your disk space usage. If your disk quota isn't full (usage less than 500 GB), it is very likely that your disk is filled up with 'snapshot' files, which are invisible to users and used to track fine-grained changes to your files for recovering lost/deleted files. In this case, please contact OSC Help for further assistance. To avoid this situation in the future, consider running jobs that do a lot of disk I/O in the temporary filesystem ($TMPDIR or $PFSDIR) and copy the final output back at the end of the run. See Available File Systems for more information.
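The pattern of working in temporary space and copying back only the final output might look roughly like the sketch below. It is an illustration under assumptions: the file name output.dat and the printf placeholder stand in for a real application, and the fallbacks to $PWD and /tmp exist only so the sketch also runs outside a batch job.

```shell
#!/bin/bash
# Outside a batch job, fall back to stand-ins so the sketch still runs.
workdir="${PBS_O_WORKDIR:-$PWD}"
scratch=$(mktemp -d "${TMPDIR:-/tmp}/scratch_demo.XXXXXX") || exit 1

cd "$scratch" || exit 1

# Placeholder for the I/O-heavy step; a real job would run its application here.
printf 'final result\n' > output.dat

# Copy only the final output back, then clean up the scratch area.
cp output.dat "$workdir"/
cd "$workdir" || exit 1
rm -rf "$scratch"
```

Intermediate files never touch the home directory, so they do not count against its quota or its snapshots.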
tar and gzip can be used together to produce compressed file archives representing entire directory structures. These allow convenient packaging of entire directory contents. For example, to package a directory structure rooted at src/, use:
tar -czvf src.tar.gz src/
This archive can then be unpackaged using
tar -xzvf src.tar.gz
where the resulting directory/file structure is identical to what it was initially.
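A quick way to convince yourself that the round trip preserves everything is to build a small tree, archive it, extract it elsewhere, and compare. The sketch below does this in a throwaway directory; the file names are made up for the demonstration.

```shell
# Build a small example tree (a stand-in for a real source directory).
demo=$(mktemp -d)
mkdir -p "$demo/src/lib"
printf 'hello\n' > "$demo/src/main.txt"
printf 'world\n' > "$demo/src/lib/util.txt"

# Create the archive; -C changes directory first so paths inside start at src/.
tar -czf "$demo/src.tar.gz" -C "$demo" src

# List the contents without extracting anything.
tar -tzf "$demo/src.tar.gz"

# Extract into a separate directory and compare against the original tree.
mkdir "$demo/restore"
tar -xzf "$demo/src.tar.gz" -C "$demo/restore"
diff -r "$demo/src" "$demo/restore/src" && echo "archive round-trip OK"
```

Listing with -t before extracting is a useful habit: it shows whether the archive will unpack into a single directory or scatter files into the current one.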
compress can also be used to create compressed file archives. See the man pages on these programs for more details.
If tar with the options zcvf is taking too long, you can instead use pigz in conjunction with tar. pigz performs gzip compression while taking advantage of multiple cores.
tar cvf - paths-to-archive | pigz > archive.tgz
pigz defaults to using eight cores, but you can have it use more or fewer with the -p argument.
tar cvf - paths-to-archive | pigz -p 4 > archive.tgz
Due to the parallel nature of pigz, if you are using it on a login node you should limit it to using 2 cores. If you would like to use more cores, you need to submit either an interactive or batch job to the queue and do the compression from within the job.
pigz does not significantly improve decompression time.
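As a sketch, the tar and pigz pipeline can be wrapped in a function that defaults to 2 cores (the login-node limit given above) and falls back to plain gzip when pigz is not installed. The function name compress_dir and its argument order are hypothetical.

```shell
# compress_dir is a hypothetical helper name.
# Usage: compress_dir <directory> <archive.tgz> [cores]
compress_dir() {
    src="$1"
    out="$2"
    cores="${3:-2}"   # default of 2 follows the login-node limit
    if command -v pigz >/dev/null 2>&1; then
        # -p sets the number of compression processes pigz may use
        tar cf - "$src" | pigz -p "$cores" > "$out"
    else
        # Fallback when pigz is not available: single-threaded gzip
        tar cf - "$src" | gzip > "$out"
    fi
}
```

Either branch produces a gzip-compatible archive, so tar -xzf unpacks the result regardless of which tool compressed it.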
See the Knowledge Base.
Linux is an open-source operating system that is similar to UNIX. It is widely used in High Performance Computing.
See the Unix Basics tutorial for more information. There are also many tutorials available on the web.
Secure Shell (SSH) is a program to log into another computer over a network, to execute commands in a remote machine, and to move files from one machine to another. It provides strong authentication and secure communications over insecure channels. SSH provides secure X connections and secure forwarding of arbitrary TCP connections.
SSH works by the exchange and verification of information, using public and private keys, to identify hosts and users. The ssh-keygen command creates a directory ~/.ssh and files that contain your authentication information. The public key is stored in ~/.ssh/id_rsa.pub and the private key is stored in ~/.ssh/id_rsa. Share only your public key. Never share your private key. To further protect your private key you should enter a passphrase to encrypt the key when it is stored in the file system. This will prevent people from using it even if they gain access to your files.
One other important file is ~/.ssh/authorized_keys. Append your public keys to the authorized_keys file and keep the same copy of it on each system where you will make ssh connections.
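Appending a public key by hand is easy to get wrong (duplicate entries, overly open permissions). The sketch below shows one cautious way to do it; add_authorized_key is a hypothetical helper name, and the 700/600 permissions are a conventional choice rather than a stated OSC requirement.

```shell
# add_authorized_key is a hypothetical helper name.
# Usage: add_authorized_key <public-key-file> [ssh-directory]
add_authorized_key() {
    pubkey_file="$1"
    ssh_dir="${2:-$HOME/.ssh}"
    auth_file="$ssh_dir/authorized_keys"

    # Keep the directory and file private to the owner.
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    touch "$auth_file" && chmod 600 "$auth_file"

    key=$(cat "$pubkey_file") || return 1
    # Append only if this exact line is not already present.
    grep -qxF -- "$key" "$auth_file" || printf '%s\n' "$key" >> "$auth_file"
}
```

After generating a key pair with ssh-keygen, running add_authorized_key ~/.ssh/id_rsa.pub adds the public half; running it a second time leaves the file unchanged.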
In addition, on Owens the default SSH client config enables hashing of a user’s known_hosts file. So if SSH is used on Owens, the remote system’s SSH key is added to ~/.ssh/known_hosts in a hashed format which can’t be unhashed. If the remote server’s SSH key changes, special steps must be taken to remove the SSH key entry:
ssh-keygen -R <hostname>
The OSC OnDemand portal allows you to connect to our systems using your web browser, without having to install any software. You get a login shell and also the ability to transfer files.
Most file transfers are done using sftp (SSH File Transfer Protocol) or scp (Secure CoPy). These utilities are usually provided on Linux/UNIX and Mac platforms. Windows users should read the next section, "Where can I find SSH and SFTP clients".
There are many SSH and SFTP clients available, both commercial and free. See Getting Connected for some suggestions.
Graphics are handled using the X11 protocol. You’ll need to run an X display server on your local system and also set your SSH client to forward (or "tunnel") X11 connections. On most UNIX and Linux systems, the X server will probably be running already. On a Mac or Windows system, there are several choices available, both commercial and free. See our guide to Getting Connected for some suggestions.
OSC temporarily blacklists some IP addresses when multiple failed logins occur. If you are connecting from behind a NAT gateway, as is commonly used for public or campus wireless networks, and get a "connection refused" message it is likely that someone recently tried to connect multiple times and failed when connected to the same network you are on. Please contact OSC Help with your public IP address and the cluster you attempted to connect to and we will remove your IP from the blacklist. You can learn your public IP by searching for "what is my IP address" in Google.
Batch Processing Questions
On all OSC systems, batch processing is managed by the Portable Batch System (PBS). PBS batch requests (jobs) are shell scripts that contain the same set of commands that you enter interactively. These requests may also include options for the batch system that provide timing, memory, and processor information. For more information, see our guide to Batch Processing at OSC.
Use qsub to submit, qstat to check the status, and qdel to delete a batch request. For more information, see our Batch-Related Command Summary.
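A minimal batch script of the kind described above might look like the following sketch. The job name and walltime are placeholders, nodes=1:ppn=28 follows the Owens node shape used elsewhere in this FAQ, and PAA0999 is the example project code from this FAQ rather than a real allocation. Because the #PBS lines are ordinary shell comments, the script also runs unchanged from the command line.

```shell
#!/bin/bash
#PBS -N example_job
#PBS -l walltime=00:10:00
#PBS -l nodes=1:ppn=28
#PBS -j oe
#PBS -A PAA0999

# PBS sets PBS_O_WORKDIR to the directory where qsub was run;
# fall back to the current directory so the script also runs interactively.
cd "${PBS_O_WORKDIR:-$PWD}" || exit 1

# The same commands you would type interactively go here.
job_label="${PBS_JOBID:-interactive}"
echo "Running job $job_label on $(uname -n)"
```

Saved as example_job.sh (a hypothetical file name), it would be submitted with qsub example_job.sh, monitored with qstat, and removed with qdel followed by the job ID that qsub printed.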
Yes. See the -m option in our PBS Directives Summary. If you are submitting a large number of jobs, this may not be a good idea.
There are numerous reasons why a job might not run even though there appear to be processors and/or memory available. These include:
- Your account may be at or near the job count or processor count limit for an individual user.
- Your group/project may be at or near the job count or processor count limit for a group.
- The scheduler may be trying to free enough processors to run a large parallel job.
- Your job may need to run longer than the time left until the start of a scheduled downtime.
- You may have requested a scarce resource or node type, either inadvertently or by design.
See our Scheduling Policies and Limits for more information.
If your account is under multiple projects, you must explicitly specify which project you want to charge a job's usage to. This can be done with the -A PBS directive. For example, if a user was a part of both projects PAA0999 and PAQ0343 and wanted to charge a job to PAA0999, they would need to add the following line to their script:
#PBS -A PAA0999
A batch job that terminates before the script is completed can still copy files from $TMPDIR to the user's home directory via the trap command (trap commands do not work in csh and tcsh shell batch scripts). In the batch script, the trap command needs to precede the command causing the TERMination. It could be placed immediately after the PBS header lines. Here is a generic form:
trap "cd $PBS_O_WORKDIR;mkdir $PBS_JOBID;cp -R $TMPDIR/* $PBS_JOBID;exit" TERM
If a command in a batch script is killed for excessive memory usage (see Out-of-Memory (OOM) or Excessive Memory Usage for details) then the trap command may not be executed. However, normal shell scripting can handle this situation: the exit status of a command that may possibly cause an OOM can be checked and appropriate action taken. Here is a Bourne shell example:
bla_bla_big_memory_using_command_that_may_cause_an_OOM
if [ $? -ne 0 ]; then
    cd $PBS_O_WORKDIR;mkdir $PBS_JOBID;cp -R $TMPDIR/* $PBS_JOBID
    exit
fi
Finally, if a node your job is running on crashes then the trap command may not be executed. It may be possible to recover your files from batch-managed directories in this case. Contact OSC Help for assistance.
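The trap and exit-status approaches above can share one recovery routine. The sketch below is an illustration for bash batch scripts, not an OSC-supplied recipe: save_scratch and run_step are hypothetical names, and the variables PBS_O_WORKDIR, PBS_JOBID, and TMPDIR are assumed to be set by the batch system when the job runs.

```shell
#!/bin/bash
# save_scratch and run_step are hypothetical helper names.

# Copy everything in the job's temporary directory to a folder named
# after the job ID, inside the directory the job was submitted from.
save_scratch() {
    dest="$PBS_O_WORKDIR/$PBS_JOBID"
    mkdir -p "$dest" && cp -R "$TMPDIR"/. "$dest"/
}

# Rescue files if the job receives TERM. Single quotes delay evaluation
# until the signal actually arrives.
trap 'save_scratch; exit 1' TERM

# Run a command; if it fails (for example, killed for excessive memory
# usage), rescue the files and pass the failure status back to the caller.
run_step() {
    "$@"
    status=$?
    if [ "$status" -ne 0 ]; then
        save_scratch
    fi
    return "$status"
}
```

In a job script, a risky command would then be written as run_step ./my_program input.dat, where my_program is a stand-in for your own executable.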
To delete all your jobs on one of the clusters, including those currently running, queued, and on hold, log in to the cluster and run the command:
qselect -u <username> | xargs qdel
To determine the number of cores (processors) in use by your account on a particular system, run:
showq -u $USER | grep "local jobs"
To determine the number of cores (processors) in use by your primary project on a particular system, run:
showq -w acct=$GROUP | grep "local jobs"
Generally, to see the number of cores in use by a particular project on a particular system, run:
showq -w acct=<project> | grep "local jobs"
Replace <project> with the project ID.
By default, we don't start an X server on GPU nodes because it impacts computational performance. Add vis to your GPU request so that the batch system uses the GPUs for visualization. For example, on Owens, it should be:
nodes=1:ppn=28:gpus=1:vis
Compiling System Questions
Fortran, C, and C++ are available on all OSC systems. The commands used to invoke the compilers and/or loaders vary from system to system. For more information, see our Compilation Guide.
We have Intel, PGI, and GNU compilers available on all systems. Each compiler vendor supports some options that the others don’t, so the choice depends on your individual needs. For more information, see our Compilation Guide.
Most serial code built on one system will run on another system, although it may run more efficiently if it is built and run on the same system. Parallel (MPI) code typically must be built on the system where it will run.
Parallel Processing Questions
Parallel processing is the simultaneous use of more than one computer (or processor) to solve a problem. There are many different kinds of parallel computers. They are distinguished by the kind of interconnection between processors or nodes (groups of processors) and between processors and memory.
On most systems, both shared-memory and distributed-memory parallel programming models can be used. Versions of OpenMP (for multithreading or shared-memory usage) and MPI (for message-passing or distributed-memory usage) are available. A summary of parallel environments will be coming soon.
A core is a processor. When a single chip contains multiple processors, they are called cores.
We are currently working on a guide for this. Please contact OSC Help for assistance.
See the Software section for more information.
Check the Software section to see what versions are installed. You can also check the installed modules using the module spider or module avail command.
Please contact OSC Help for assistance.
See the Software section for information on third-party math libraries (e.g., MKL, ACML, fftw, scalapack, etc). MKL and ACML are highly optimized libraries that include the BLAS and LAPACK plus some other math routines.
The NumPy and SciPy modules are installed with the python software. See the Python software page.
Please refer to the Software Forms page. You will see a link to Request for Software Form. Download the form, complete the information, and attach the form to an e-mail to email@example.com. The Statewide Users Group will consider the request.
You may install open source software yourself in your home directory. If you have your own license for commercial software, contact the OSC Help desk.
Most packages have a (poorly documented) option to install under a normal user account. Contact the OSC Help desk if you need assistance. We generally do not install user software as root.
Modules are used to manage the environment variable settings associated with software packages in a shell-independent way. On OSC's systems, you will by default have modules in your environment for PBS, MPI, compilers, and a few other pieces of software. For information on using the module system, see our guide to Batch Processing at OSC.
Performance Analysis Questions
MegaFLOPS/GigaFLOPS/TeraFLOPS/PetaFLOPS are millions/billions/trillions/quadrillions of FLoating-point Operations (calculations) Per Second.
A number of performance analysis tools are available on OSC systems. Some are general to all systems and others are specific to a particular system. See our performance analysis guide for more info.
There are several ways to optimize code. Key areas to consider are CPU optimization, I/O optimization, memory optimization, and parallel optimization. See our optimization strategy guide for more info.
Other Common Problems
Programs run on the login nodes are subject to strict CPU time limits. To run an application that takes more time, you need to create a batch request. Your batch request should include an appropriate estimate for the amount of time that your application will need. See our guide to Batch Processing at OSC for more information.
Programs run on the login nodes are subject to strict CPU time limits. Because file transfers use encryption, you may hit this limit when transferring a large file. To run longer programs, use the batch system. To transfer larger files, connect to sftp.osc.edu instead of to a login node.
This is most commonly caused by trying to access an array beyond its bounds -- for example, trying to access element 15 of an array with only 10 elements. Unallocated arrays and invalid pointers are other causes. You may wish to debug your program using one of the available tools such as the TotalView Debugger.
I created a batch script in a text editor on a Windows or Mac system, but when I submit it on an OSC system, almost every line in the script gives an error. Why is that?
Windows and Mac have different end-of-line conventions for text files than UNIX and Linux systems do, and most UNIX shells (including the ones interpreting your batch script) don't like seeing the extra character that Windows appends to each line or the alternate character used by Mac. You can use the following commands on the Linux system to convert a text file from Windows or Mac format to UNIX format:
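As one hedged sketch using only standard tools (an assumption, and not necessarily the commands OSC recommends), tr can perform both conversions. The function names crlf_to_lf and cr_to_lf are hypothetical.

```shell
# Windows (CRLF) to UNIX (LF): delete every carriage return.
# Usage: crlf_to_lf <input-file> <output-file>
crlf_to_lf() {
    tr -d '\r' < "$1" > "$2"
}

# Classic Mac (CR only) to UNIX (LF): turn each carriage return into a newline.
# Usage: cr_to_lf <input-file> <output-file>
cr_to_lf() {
    tr '\r' '\n' < "$1" > "$2"
}
```

Write to a new file rather than back onto the input: redirecting output to the same file that is being read would empty it before tr sees any data.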
A text file created on Linux/UNIX will usually display correctly in Wordpad but not in Notepad. You can use the following command on the Linux system to convert a text file from UNIX format to Windows format:
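For the reverse direction, a sketch with awk (again an assumption using standard tools, with the hypothetical function name lf_to_crlf) appends a carriage return to every line:

```shell
# UNIX (LF) to Windows (CRLF): re-emit each line with \r\n at the end.
# Usage: lf_to_crlf <input-file> <output-file>
# Note: running this on a file that already has CRLF endings doubles the \r.
lf_to_crlf() {
    awk '{ printf "%s\r\n", $0 }' "$1" > "$2"
}
```

The converted copy can then be transferred to the Windows system and opened in Notepad.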
See our knowledge base article on the topic.