- Who can get an account?
- Where should a new OSC user begin?
- Do I have to pay for supercomputer use?
- How many supercomputers does OSC have? Which one should I use?
- How do I cite OSC in my publications?
- How do I submit my publications and funding information to OSC?
- Can I receive a letter of support from OSC when I apply for outside funding?
- How do I register for a workshop?
- Where can I find documentation?
- My question isn't answered here. Whom can I ask for help?
- Something seems to be wrong with the OSC systems. Should I contact the help desk?
- Where can I find logos for my presentations, posters, etc.?
- What are projects and accounts?
- How do I get/renew an account?
- I'm a faculty member. How do I get accounts for my students?
- I'm continuing the research of a student who graduated. Can I use his/her account?
- I'm working closely with another student. Can we share an account?
- How do I change my password?
- I want to use csh instead of bash. How do I change the default shell?
- How do I find my account balance?
- How do I get more resources?
- How much will my account be charged for supercomputer usage?
Disk Storage Questions
- What is my disk quota?
- How can I determine the total disk space used by my account?
- How do I get more disk space?
- How can I find my largest directories?
- Why do I receive "no space left" error when writing data to my home directory?
- How can I use tar and gzip to aggregate and compress files?
- Tar is taking too long. Is there a way to compress quicker?
- How do I change the email address OSC uses for me?
- I got an automated email from OSC. Where can I get more information about it?
- What is SSH?
- How does SSH work?
- Can I connect without using an SSH client?
- How can I upload or download files?
- Where can I find SSH and SFTP clients?
- How do I run a graphical application in an ssh session?
- Why do I get "connection refused" when trying to connect to a cluster?
Batch Processing Questions
- What is a batch job?
- How do I submit, check the status, and/or delete a batch job?
- Can I be notified by email when my batch job starts or ends?
- Why won't my job run?
- Why do I get the error 'You have not specified an account and have more than one available.'?
- How can I retrieve files from unexpectedly terminated jobs?
- How can I delete all of my jobs on a cluster?
- How can I determine the number of cores in use by me or my group?
- How do I request GPU nodes for visualization?
Compiling System Questions
- What languages are available?
- What compiler (vendor) do you recommend?
- Will software built for one system run on another system?
- What is the difference between building software on a local computer and on an OSC cluster?
- What is this build error: "... relocation truncated to fit ..."?
Parallel Processing Questions
- What is parallel processing?
- What parallel processing environments are available?
- What is a core?
- I'm not seeing the performance I expected. How can I be sure my code is running in parallel?
- What software applications are available?
- Do you have a newer version of (name your favorite software)?
- How do I get authorized to use a particular software application?
- What math routines are available? Do you have ATLAS and LAPACK?
- Do you have NumPy/SciPy?
- OSC does not have a particular software package I would like to use. How can I request it?
- I have a software package that must be installed as root. What should I do?
- What are modules?
Performance Analysis Questions
- What are MFLOPS/GFLOPS/TFLOPS/PFLOPS?
- How do I find out about my code's performance?
- How can I optimize my code?
Other Common Problems
- What does "CPU time limit exceeded" mean?
- My program or file transfer died for no reason after 20 minutes. What happened?
- Why did my program die with a segmentation fault, address error, or signal 11?
- I created a batch script in a text editor on a Windows system, but when I submit it on an OSC system, almost every line in the script gives an error. Why is that?
- I copied my output file to a Windows system, but it doesn't display correctly. How can I fix it?
- What IP ranges do I need to allow in my firewall to use OSC services?
Anyone can have an account with OSC, but you need access to a project to use our resources. If an eligible principal investigator has a current project, they can add the user through the MyOSC client portal. Authorized users do not have to be located in Ohio or at the same institution.
See our webpage for more information: https://www.osc.edu/supercomputing/support/account
Once you are able to connect to our HPC systems, you should start familiarizing yourself with the software and services available from the OSC, including:
It depends on the type of client and your rate of consumption. Please click here for more information.
OSC currently has three HPC clusters: the Pitzer Cluster, a 29,664-core Dell cluster with Intel Xeon processors; the Owens Cluster, a 23,500+ core Dell cluster with Intel Xeon processors; and the Ascend Cluster, a 2,304-core Dell cluster devoted to intensive GPU processing. New users have access to the Pitzer and Owens clusters. To learn more, click here.
Any publication of any material, whether copyrighted or not, based on or developed with OSC services, should cite the use of OSC, and the use of the specific services (where applicable). For more information about citing OSC, please visit www.osc.edu/citation.
You can add these to your profile in MyOSC. You can then associate them with OSC project(s).
See our website for more information: https://www.osc.edu/supercomputing/portals/client_portal/manage_profile_information
OSC has a standard letter of support that you can include (electronically or in hard copy) with a proposal for outside funding. This letter does not replace the budget process. To receive the letter of support, please send your request to email@example.com. You should provide the following information: name and address of the person/organization to whom the letter should be addressed; name(s) of the principal investigator(s) and the institution(s); title of the proposal; number of years of proposed project; budget requested per year. Please allow at least two working days to process your request.
Hardware information about the systems is available at http://www.osc.edu/supercomputing/hardware
Information will be coming soon for guidelines on reporting possible system problems.
Please see our citation webpage.
An eligible principal investigator heads a project. Under a project, authorized users have accounts with credentials that permit users to gain access to the HPC systems. A principal investigator can have more than one project.
For information concerning accounts (i.e., how to apply, who can apply, etc.), see Accounts.
If an eligible principal investigator is new to OSC, they can create a new project. If an eligible principal investigator has a current project, they can add the user through the MyOSC client portal. Authorized users do not have to be located in Ohio or at the same institution.
Please have your PI send an email to firstname.lastname@example.org for further discussions.
No. Each person using the OSC systems must have his/her own account. Sharing files is possible, even with separate accounts.
You can change your password through the MyOSC portal. Log in at MyOSC, and click your name in the upper right-hand corner to open a dropdown menu. Select the "change password" item. Please note that your password has certain requirements; these are specified on the "change password" portal. You may need to wait up to 20 minutes to be able to log in with the new password. For security purposes, please note that our password change policy requires a password change every 180 days.
If your password has expired, you can update it by following the "Forgot your password?" link on the MyOSC login page.
You can change your default shell through the MyOSC portal. Log in at MyOSC, and use the "Unix Shell" drop-down menu in the HPC User Profile box to change your shell. You will need to log off the HPC system and log back on before the change goes into effect. Please note that it will take a few minutes for the change to be applied.
To see usage and balance information from any system, refer to the OSCusage page.
NOTE: Accounting is updated once a day, so the account balance is for the previous day.
To request additional use of our resources, the principal investigator will need to change the budget for their project. Please see the creating budgets and projects page.
If the project is associated with an Ohio academic institution, see the academic fee structure page for pricing.
If the project is NOT associated with an Ohio academic institution, contact OSC Sales for information on pricing.
See Job and storage charging for how OSC calculates charges.
Disk Storage Questions
Each user has a quota of 500 gigabytes (GB) of storage and 1,000,000 files. You may also have access to a project directory with a separate quota. See Available File Systems for more information.
Your quota and disk usage are displayed every time you log in. You have limits on both the amount of space you use and the number of files you have. There are separate quotas for your home directory and any project directories you have access to.
Note: The quota information displayed at login is updated twice a day, so the information may not reflect the current usage.
You may display your home directory quota information with
A PI may request project space to be shared by all users on a project. Estimate the amount of disk space that you will need and the duration that you will need it. Send requests to email@example.com.
To reveal the directories in your account that are taking up the most disk space, you can use the du, sort, and tail commands. For example, to display the ten largest directories, change to your home directory and then run the command:
du . | sort -n | tail -n 10
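The same idea can be wrapped in a small helper that prints human-readable sizes. This is only a sketch: it assumes GNU coreutils (for `du --max-depth` and `sort -h`), and the function name `largest_dirs` is made up for illustration.

```bash
# Print the N largest immediate subdirectories of a directory, largest last.
# Usage: largest_dirs [directory] [count]
# Assumes GNU du and sort (the -h and --max-depth options).
largest_dirs() {
    local dir="${1:-.}" count="${2:-10}"
    # -h prints sizes such as 1.5G; sort -h orders those suffixes correctly.
    du -h --max-depth=1 "$dir" 2>/dev/null | sort -h | tail -n "$count"
}
```

For example, `largest_dirs ~ 10` lists the ten largest entries under your home directory. The last line is the total for the directory itself.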
If you receive the error "No space left on device" when you try to write data to your home directory, it indicates the disk is full. First, check your home directory quota. Each user has a 500 GB storage quota, and the quota information is shown when you log in to our systems. If your disk quota is full, consider reducing your disk space usage. If your disk quota isn't full (usage less than 500 GB), it is very likely that your disk is filled up with 'snapshot' files, which are invisible to users and are used to track fine-grained changes to your files for recovering lost or deleted files. In this case, please contact OSC Help for further assistance. To avoid this situation in the future, consider running jobs that do a lot of disk I/O in the temporary filesystem ($TMPDIR or $PFSDIR) and copying the final output back at the end of the run. See Available File Systems for more information.
tar and gzip can be used together to produce compressed file archives representing entire directory structures. These allow convenient packaging of entire directory contents. For example, to package a directory structure rooted at src/, use
tar -czvf src.tar.gz src/
This archive can then be unpackaged using
tar -xzvf src.tar.gz
where the resulting directory/file structure is identical to what it was initially.
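If you want to confirm that an archive really reproduces the original tree before deleting anything, a round-trip check along the following lines may help. It is a sketch under assumptions: the helper name `pack_and_verify` is hypothetical, and the directory is expected to be given as a relative path.

```bash
# Pack a directory, then confirm the unpacked copy matches the original.
# Usage: pack_and_verify <relative-directory> <archive.tar.gz>
pack_and_verify() {
    local src="$1" archive="$2" scratch
    scratch="$(mktemp -d)" || return 1
    tar -czf "$archive" "$src" &&          # c: create, z: gzip, f: archive file
    tar -tzf "$archive" >/dev/null &&      # t: list contents (checks readability)
    tar -xzf "$archive" -C "$scratch" &&   # x: extract into the scratch directory
    diff -r "$src" "$scratch/$src"         # no output means the trees are identical
    local status=$?
    rm -rf "$scratch"
    return $status
}
```

Running `pack_and_verify src src.tar.gz` returns a zero exit status only if the archive was created and the extracted copy is identical to `src`.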
compress can also be used to create compressed file archives. See the man pages on these programs for more details.
If using tar with the options zcvf is taking too long, you can instead use pigz in conjunction with tar. pigz does gzip compression while taking advantage of multiple cores.
tar cvf - paths-to-archive | pigz > archive.tgz
pigz defaults to using eight cores, but you can have it use more or fewer with the -p argument.
tar cvf - paths-to-archive | pigz -p 4 > archive.tgz
Due to the parallel nature of pigz, if you are using it on a login node you should limit it to using 2 cores. If you would like to use more cores, you need to submit either an interactive or batch job to the queue and do the compression from within the job.
pigz does not significantly improve decompression time.
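The pigz pipeline can be wrapped so that it still works where pigz is not installed. The following is a minimal sketch, not an OSC-provided tool: the function name `compress_dir` is hypothetical, and the default of 2 threads simply follows the login-node guidance above.

```bash
# Archive a directory with pigz if it is available, otherwise fall back to gzip.
# Usage: compress_dir <directory> <archive.tgz> [threads]
compress_dir() {
    local src="$1" out="$2" threads="${3:-2}"   # 2 threads is safe on a login node
    if command -v pigz >/dev/null 2>&1; then
        tar cf - "$src" | pigz -p "$threads" > "$out"
    else
        tar cf - "$src" | gzip > "$out"
    fi
}
```

Either branch produces a normal gzip-compressed tar file, so `tar -xzf archive.tgz` unpacks it regardless of which tool created it.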
See the Knowledge Base.
Linux is an open-source operating system that is similar to UNIX. It is widely used in High Performance Computing.
See the Unix Basics tutorial for more information. There are also many tutorials available on the web.
Secure Shell (SSH) is a program to log into another computer over a network, to execute commands in a remote machine, and to move files from one machine to another. It provides strong authentication and secure communications over insecure channels. SSH provides secure X connections and secure forwarding of arbitrary TCP connections.
SSH works by the exchange and verification of information, using public and private keys, to identify hosts and users. The ssh-keygen command creates a directory ~/.ssh and files that contain your authentication information. The public key is stored in ~/.ssh/id_rsa.pub and the private key is stored in ~/.ssh/id_rsa. Share only your public key. Never share your private key. To further protect your private key, you should enter a passphrase to encrypt the key when it is stored in the file system. This will prevent people from using it even if they gain access to your files.
One other important file is ~/.ssh/authorized_keys. Append your public keys to the authorized_keys file and keep the same copy of it on each system where you will make ssh connections.
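As an illustration of the append step, the helper below adds a public key to authorized_keys without creating duplicate entries. It is a sketch, assuming a one-line public key file; the name `add_authorized_key` is hypothetical. The key pair itself would typically be generated beforehand with `ssh-keygen -t rsa`, which prompts for the passphrase.

```bash
# Append a public key to authorized_keys, skipping it if already present.
# Usage: add_authorized_key <public-key-file> [ssh-directory]
add_authorized_key() {
    local pubkey="$1" ssh_dir="${2:-$HOME/.ssh}"
    local keys="$ssh_dir/authorized_keys"
    # SSH ignores key files with loose permissions, so tighten them first.
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir" || return 1
    touch "$keys" && chmod 600 "$keys" || return 1
    # -x: whole-line match, -F: fixed string, -f: read the pattern from the key file.
    grep -qxF -f "$pubkey" "$keys" || cat "$pubkey" >> "$keys"
}
```

For example, `add_authorized_key ~/.ssh/id_rsa.pub` run on the system you want to log in to adds the key once, however many times it is called.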
In addition, on Owens the default SSH client config enables hashing of a user’s known_hosts file. So if SSH is used on Owens, the remote system’s SSH key is added to ~/.ssh/known_hosts in a hashed format which can’t be unhashed. If the remote server’s SSH key changes, special steps must be taken to remove the SSH key entry:
ssh-keygen -R <hostname>
The OSC OnDemand portal allows you to connect to our systems using your web browser, without having to install any software. You get a login shell and also the ability to transfer files.
Most file transfers are done using sftp (SSH File Transfer Protocol) or scp (Secure CoPy). These utilities are usually provided on Linux/UNIX and Mac platforms. Windows users should read the next section, "Where can I find SSH and SFTP clients".
There are many SSH and SFTP clients available, both commercial and free. See Getting Connected for some suggestions.
Graphics are handled using the X11 protocol. You’ll need to run an X display server on your local system and also set your SSH client to forward (or "tunnel") X11 connections. On most UNIX and Linux systems, the X server will probably be running already. On a Mac or Windows system, there are several choices available, both commercial and free. See our guide to Getting Connected for some suggestions.
OSC temporarily blacklists some IP addresses when multiple failed logins occur. If you are connecting from behind a NAT gateway, as is commonly used for public or campus wireless networks, and get a "connection refused" message it is likely that someone recently tried to connect multiple times and failed when connected to the same network you are on. Please contact OSC Help with your public IP address and the cluster you attempted to connect to and we will remove your IP from the blacklist. You can learn your public IP by searching for "what is my IP address" in Google.
Batch Processing Questions
On all OSC systems, batch processing is managed by the Simple Linux Utility for Resource Management system (Slurm). Slurm batch requests (jobs) are shell scripts that contain the same set of commands that you enter interactively. These requests may also include options for the batch system that provide timing, memory, and processor information. For more information, see our guide to Batch Processing at OSC.
Use sbatch to submit, squeue to check the status, and scancel to delete a batch request. For more information, see our Batch-Related Command Summary.
Yes. See the --mail-type option in our Slurm documentation. If you are submitting a large number of jobs, this may not be a good idea.
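To show how these pieces fit together, here is a minimal sketch of a batch script that requests email notification. The job name and the project code `PAS1234` are placeholders, not real values, and the resource requests are arbitrary.

```bash
#!/bin/bash
#SBATCH --job-name=example        # hypothetical job name
#SBATCH --account=PAS1234         # hypothetical project code; replace with your own
#SBATCH --time=00:10:00           # wall-clock limit (HH:MM:SS)
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --mail-type=END,FAIL      # send email when the job ends or fails

# Slurm sets SLURM_SUBMIT_DIR and SLURM_JOB_ID inside a job. The fallbacks
# only make the script runnable outside the batch system for a dry run.
workdir="${SLURM_SUBMIT_DIR:-$PWD}"
jobid="${SLURM_JOB_ID:-interactive}"
cd "$workdir" || exit 1
echo "Job $jobid running on $(uname -n) in $workdir"
```

Saved as, say, `job.sh`, the script would be submitted with `sbatch job.sh`, checked with `squeue -u <username>`, and cancelled with `scancel <jobid>`.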
There are numerous reasons why a job might not run even though there appear to be processors and/or memory available. These include:
- Your account may be at or near the job count or processor count limit for an individual user.
- Your group/project may be at or near the job count or processor count limit for a group.
- The scheduler may be trying to free enough processors to run a large parallel job.
- Your job may need to run longer than the time left until the start of a scheduled downtime.
- You may have requested a scarce resource or node type, either inadvertently or by design.
See our Scheduling Policies and Limits for more information.
A batch job that terminates before the script is completed can still copy files from $TMPDIR to the user's home directory by using signal handling. The batch script should include the additional sbatch option --signal. See Signal handling in job scripts for details.
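A signal handler of this kind might look like the sketch below. It is an assumption-laden illustration rather than OSC's reference script: the choice of USR1, the 60-second lead time, and the function name `save_results` are all arbitrary.

```bash
#!/bin/bash
#SBATCH --signal=B:USR1@60   # send USR1 to the batch shell 60 s before the time limit

# Copy everything in the job's scratch directory to a directory named after
# the job ID, inside the directory the job was submitted from.
save_results() {
    local dest="${SLURM_SUBMIT_DIR:-$PWD}/${SLURM_JOB_ID:-unknown_job}"
    mkdir -p "$dest" || return 1
    cp -R "${TMPDIR:-/tmp}"/. "$dest"/
}

# Run the handler when USR1 arrives, then stop the script.
trap 'save_results; exit 1' USR1

# bash acts on a trapped signal only after the current foreground command
# returns, so start the real work in the background and wait for it:
# my_long_running_program &   # hypothetical command; replace with your own
# wait
```

The background-and-wait pattern matters: without it, the handler would not run until the long-running command had already finished or been killed.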
If a command in a batch script is killed for excessive memory usage (see Out-of-Memory (OOM) or Excessive Memory Usage for details), then the handler may not be able to fully execute its commands. However, normal shell scripting can handle this situation: the exit status of a command that may possibly cause an OOM can be checked and appropriate action taken. Here is a Bourne shell example:
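bla_bla_big_memory_using_command_that_may_cause_an_OOM
if [ $? -ne 0 ]; then
    # The command failed (possibly OOM-killed): save the scratch files
    # in a directory named after the job ID.
    cd "$SLURM_SUBMIT_DIR"
    mkdir -p "$SLURM_JOB_ID"
    cp -R "$TMPDIR"/* "$SLURM_JOB_ID"
fi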
Finally, if a node your job is running on crashes then the commands in the signal handler may not be executed. It may be possible to recover your files from batch-managed directories in this case. Contact OSC Help for assistance.
To delete all your jobs on one of the clusters, including those currently running, queued, and on hold, log in to the cluster and run the command:
scancel -u <username>
# current jobs queued/running and cpus requested
squeue --cluster=all --account=<proj-code> --Format=jobid,partition,name,timeLeft,timeLimit,numCPUS
# or for a user
squeue --cluster=all -u <username> --Format=jobid,partition,name,timeLeft,timeLimit,numCPUS
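To turn a per-job listing into a single core count, the CPU column can be summed. The helper below is a sketch; the name `sum_cpus` is hypothetical, and it expects one number per line on standard input.

```bash
# Sum a column of CPU counts read from standard input (one number per line).
sum_cpus() {
    awk '{ total += $1 } END { print total + 0 }'
}

# Intended use on a cluster (not run here): %C prints the CPU count per job,
# -h suppresses the header, and -t RUNNING keeps only running jobs.
#   squeue -h -t RUNNING -u <username> -o "%C" | sum_cpus
#   squeue -h -t RUNNING --account=<proj-code> -o "%C" | sum_cpus
```

Dropping `-t RUNNING` would count cores requested by queued jobs as well.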
By default, we don't start an X server on GPU nodes because it impacts computational performance. Add vis in your GPU request so that the batch system uses the GPUs for visualization. For example, on Owens, it should be
--nodes=1 --ntasks-per-node=28 --gpus-per-node=1 --gres=vis
Compiling System Questions
Fortran, C, and C++ are available on all OSC systems. The commands used to invoke the compilers and/or loaders vary from system to system. For more information, see our Compilation Guide.
We have Intel, PGI, and GNU compilers available on all systems. Each compiler vendor supports some options that the others don’t, so the choice depends on your individual needs. For more information, see our Compilation Guide.
Most serial code built on one system will run on another system, although it may run more efficiently if it is built and run on the same system. Parallel (MPI) code typically must be built on the system where it will run.
One major difference is that OSC users cannot install software system-wide using package managers. In general, users installing software in their home directories will follow the configure/build/test paradigm that is common on Unix-like operating systems. For more information, see our HOWTO: Locally Installing Software on an OSC cluster.
OSC users installing software on a cluster occasionally report this error. It is related to memory addressing and is usually fixed by cleaning the current build and rebuilding with the compiler option "-mcmodel=medium". For more details, see the man page for the compiler.
Parallel Processing Questions
Parallel processing is the simultaneous use of more than one computer (or processor) to solve a problem. There are many different kinds of parallel computers. They are distinguished by the kind of interconnection between processors or nodes (groups of processors) and between processors and memory.
On most systems, both shared-memory and distributed-memory parallel programming models can be used. Versions of OpenMP (for multithreading or shared-memory usage) and MPI (for message-passing or distributed-memory usage) are available. A summary of parallel environments will be coming soon.
A core is a processor. When a single chip contains multiple processors, they are called cores.
We are currently working on a guide for this. Please contact OSC Help for assistance.
See the Software section for more information.
Check the Software section to see what versions are installed. You can also check the installed modules using the module spider or module avail command.
Please contact OSC Help for assistance.
See the Software section for information on third-party math libraries (e.g., MKL, ACML, fftw, scalapack, etc). MKL and ACML are highly optimized libraries that include the BLAS and LAPACK plus some other math routines.
The NumPy and SciPy modules are installed with the python software. See the Python software page.
You may install open source software yourself in your home directory. If you have your own license for commercial software, contact the OSC Help desk.
Most packages have a (poorly documented) option to install under a normal user account. Contact the OSC Help desk if you need assistance. We generally do not install user software as root.
Modules are used to manage the environment variable settings associated with software packages in a shell-independent way. On OSC's systems, you will by default have modules in your environment for PBS, MPI, compilers, and a few other pieces of software. For information on using the module system, see our guide to Batch Processing at OSC.
Performance Analysis Questions
MegaFLOPS/GigaFLOPS/TeraFLOPS/PetaFLOPS are millions/billions/trillions/quadrillions of FLoating-point Operations (calculations) Per Second.
A number of performance analysis tools are available on OSC systems. Some are general to all systems and others are specific to a particular system. See our performance analysis guide for more info.
There are several ways to optimize code. Key areas to consider are CPU optimization, I/O optimization, memory optimization, and parallel optimization. See our optimization strategy guide for more info.
Other Common Problems
Programs run on the login nodes are subject to strict CPU time limits. To run an application that takes more time, you need to create a batch request. Your batch request should include an appropriate estimate for the amount of time that your application will need. See our guide to Batch Processing at OSC for more information.
Programs run on the login nodes are subject to strict CPU time limits. Because file transfers use encryption, you may hit this limit when transferring a large file. To run longer programs, use the batch system. To transfer larger files, connect to sftp.osc.edu instead of to a login node.
This is most commonly caused by trying to access an array beyond its bounds -- for example, trying to access element 15 of an array with only 10 elements. Unallocated arrays and invalid pointers are other causes. You may wish to debug your program using one of the available tools such as the TotalView Debugger.
I created a batch script in a text editor on a Windows or Mac system, but when I submit it on an OSC system, almost every line in the script gives an error. Why is that?
Windows and Mac have different end-of-line conventions for text files than UNIX and Linux systems do, and most UNIX shells (including the ones interpreting your batch script) don't like seeing the extra character that Windows appends to each line or the alternate character used by Mac. You can use the following commands on the Linux system to convert a text file from Windows or Mac format to UNIX format:
A text file created on Linux/UNIX will usually display correctly in Wordpad but not in Notepad. You can use the following command on the Linux system to convert a text file from UNIX format to Windows format:
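Both conversions can also be done with standard tools. The sketch below is one possible approach and makes no claim to match OSC's recommended commands; the function names are hypothetical, and each function writes the converted text to standard output.

```bash
# Windows (CRLF) to UNIX (LF): strip the carriage return at the end of each line.
to_unix() {
    awk '{ sub(/\r$/, ""); print }' "$1"
}

# Classic Mac (CR only) to UNIX (LF): turn every carriage return into a newline.
mac_to_unix() {
    tr '\r' '\n' < "$1"
}

# UNIX (LF) to Windows (CRLF): end every line with carriage return + newline.
to_windows() {
    awk '{ sub(/\r$/, ""); printf "%s\r\n", $0 }' "$1"
}
```

For example, `to_unix job_windows.sh > job.sh` writes a converted copy and leaves the original file untouched.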
See our knowledge base article on the topic.