CUDA™ (Compute Unified Device Architecture) is a parallel computing platform and programming model developed by Nvidia that enables dramatic increases in computing performance by harnessing the power of the graphics processing unit (GPU).

Availability and Compatibility

CUDA is available on the Ruby, Oakley, and Glenn clusters. The versions currently available at OSC are:

Version   Glenn   Oakley   Ruby
2.3       X       -        -
3.0       X       -        -
3.1       X       -        -
4.0       X       -        -
4.1.28    -       X        -
4.2.9     -       X        -
5.0.35    -       X        X
5.5       X       X        X
6.0.37    -       -        X
6.5.14    -       -        X



CUDA is available for use by all OSC users.


Use module avail to view available modules for a given machine. To load the appropriate CUDA module, type: module load software-name.
For example, to select CUDA version 4.1.28 on Oakley, type: module load cuda/4.1.28
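
A typical session might look like the sketch below. It assumes the module command is available in the current shell, as it is on the cluster login nodes, and it reuses the Oakley version from the example above. The guard is not part of the normal workflow; it only keeps the snippet harmless on machines without environment modules.

```shell
# Module to load; cuda/4.1.28 is the Oakley example used above.
cuda_module="cuda/4.1.28"

if command -v module >/dev/null 2>&1; then
    module avail cuda            # list the CUDA modules on this cluster
    module load "$cuda_module"   # load the chosen version
    module list                  # confirm that it is now loaded
    # CUDA_HOME is set by the module once it is loaded.
    echo "CUDA_HOME=${CUDA_HOME:-unset}"
else
    echo "module command not found; run this on a cluster login node"
fi
```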

GPU Computing SDK

The NVIDIA GPU Computing SDK provides hundreds of code samples and covers a wide range of applications/techniques to help you get started on the path of writing software with CUDA C/C++ or DirectCompute. On Oakley, the SDK binaries are located in $CUDA_HOME/bin/linux/release ($CUDA_HOME is an environment variable set when you load the module).

Programming in CUDA

Please visit http://developer.nvidia.com/cuda-education-training to learn programming in CUDA. The site also contains tutorials on optimizing CUDA codes to obtain greater speedup. For more CUDA optimization techniques, see http://www.cs.berkeley.edu/~volkov/

Compiling CUDA Code

One can type module show cuda/version-number to view the environment variables set by the module.
To compile CUDA code contained in a file, say mycudaApp.cu, run the following after loading the appropriate CUDA module:
nvcc -o mycudaApp mycudaApp.cu
This creates an executable named mycudaApp.
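
For readers without a source file at hand, the sketch below writes a small test program and compiles it with the same nvcc command. The program contents (a kernel named addOne that adds 1 to each array element) are a hypothetical illustration, not an OSC sample, and the scratch directory from mktemp is only there to avoid overwriting existing files.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
src="$workdir/mycudaApp.cu"

# Write a small, hypothetical CUDA program to the source file.
cat > "$src" <<'EOF'
#include <cstdio>
#include <cuda_runtime.h>

// Each thread adds 1 to one element of the array.
__global__ void addOne(int *data, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] += 1;
}

int main()
{
    const int n = 256;
    int host[n];
    for (int i = 0; i < n; ++i) host[i] = i;

    int *dev = NULL;
    cudaMalloc((void **)&dev, n * sizeof(int));
    cudaMemcpy(dev, host, n * sizeof(int), cudaMemcpyHostToDevice);
    addOne<<<(n + 127) / 128, 128>>>(dev, n);

    // The copy back waits for the kernel and reports any earlier error.
    cudaError_t err = cudaMemcpy(host, dev, n * sizeof(int),
                                 cudaMemcpyDeviceToHost);
    cudaFree(dev);
    if (err != cudaSuccess) {
        fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    printf("host[0] = %d, host[%d] = %d\n", host[0], n - 1, host[n - 1]);
    return 0;
}
EOF

# Compile only where the CUDA toolkit is available.
if command -v nvcc >/dev/null 2>&1; then
    nvcc -o "$workdir/mycudaApp" "$src"
    echo "Built $workdir/mycudaApp; run it on a GPU node"
else
    echo "nvcc not found; load a CUDA module first"
fi
```

The executable needs a GPU to run, so it is best started from a batch or interactive job on a GPU node rather than on a login node.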

Important: The devices are configured in exclusive mode. This means that 'cudaSetDevice' should NOT be used when requesting a single GPU. On the first CUDA call, the system determines which device the application will use. If a single application uses both cards on a node, please use 'cudaSetDevice'.
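
To confirm the compute mode of the GPUs assigned to a job, one option is to query the driver with nvidia-smi. The sketch below assumes nvidia-smi is on the PATH of the GPU nodes, which this page does not state; the function name show_compute_mode is purely illustrative.

```shell
# Report the compute mode of each GPU on the current node.
# Assumption: nvidia-smi is on the PATH, as it usually is on GPU nodes.
show_compute_mode() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        nvidia-smi -q -d COMPUTE   # -d COMPUTE limits the query to compute mode
    else
        echo "nvidia-smi not found; run this inside a GPU job"
    fi
}

show_compute_mode
```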

Debugging CUDA code

cuda-gdb can be used to debug CUDA codes; module load cuda makes it available. For more information on how to use CUDA-GDB, please visit http://developer.nvidia.com/cuda-gdb.
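
A debug build usually comes first. The sketch below compiles mycudaApp.cu with nvcc's -g (host debug symbols) and -G (device debug symbols) flags. The output name mycudaApp_debug is a hypothetical choice, and the snippet assumes the CUDA module is loaded and the source file is in the current directory.

```shell
# -g adds host debug symbols, -G adds device debug symbols.
debug_flags="-g -G"

if command -v nvcc >/dev/null 2>&1 && [ -f mycudaApp.cu ]; then
    # $debug_flags is left unquoted on purpose so it splits into two flags.
    nvcc $debug_flags -o mycudaApp_debug mycudaApp.cu
    echo "Start the debugger on a GPU node with: cuda-gdb ./mycudaApp_debug"
else
    echo "nvcc or mycudaApp.cu not found; nothing compiled"
fi
```

Device debug builds disable most optimizations, so it is common to keep the debug executable separate from the one used for production runs.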

Detecting memory access errors

CUDA-MEMCHECK can be used to detect the source and cause of memory access errors in your program. For more information on how to use CUDA-MEMCHECK, please visit http://developer.nvidia.com/cuda-memcheck.
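
In its simplest form the tool wraps the executable, as in the sketch below. It assumes the CUDA module is loaded, that mycudaApp has already been built in the current directory, and that the command runs on a node with a GPU.

```shell
# Executable to check; built earlier with nvcc.
# Building with -G or -lineinfo lets cuda-memcheck report source lines.
app=./mycudaApp

if command -v cuda-memcheck >/dev/null 2>&1 && [ -x "$app" ]; then
    cuda-memcheck "$app"
else
    echo "cuda-memcheck or $app not found; nothing checked"
fi
```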

Batch Usage

The following sample batch scripts request GPU nodes on Glenn, Oakley, and Ruby. The scripts differ only in the second line, the node request. On Oakley one can specify the number of GPUs required.

Sample Batch Script (Glenn)

#PBS -l walltime=01:00:00
#PBS -l nodes=1:ppn=8:gpu
#PBS -N compute
#PBS -j oe
module load cuda
cd $HOME/cuda
cp mycudaApp $TMPDIR
cd $TMPDIR
./mycudaApp

Sample Batch Script (Oakley)

#PBS -l walltime=01:00:00
#PBS -l nodes=1:ppn=1:gpus=1
#PBS -N compute
#PBS -j oe
module load cuda
cd $HOME/cuda
cp mycudaApp $TMPDIR
cd $TMPDIR
./mycudaApp

Sample Batch Script (Ruby)

#PBS -l walltime=01:00:00
#PBS -l nodes=1:ppn=20:gpus=1
#PBS -N compute
#PBS -j oe
module load cuda
cd $HOME/cuda
cp mycudaApp $TMPDIR
cd $TMPDIR
./mycudaApp

For an interactive batch session one can run the following command:

On Glenn
qsub -I -l nodes=1:ppn=8:gpu -l walltime=00:20:00

On Oakley
qsub -I -l nodes=1:ppn=1:gpus=1 -l walltime=00:20:00

Please note that on Oakley, you can request any mix of ppn and gpus you need; please see the Job Scripts page in our batch guide for more information.

On Ruby
qsub -I -l nodes=1:ppn=20:gpus=1 -l walltime=00:20:00

Setting the GPU compute mode on Ruby

The GPUs on Ruby can be set to different compute modes as listed here.

They can be set by adding the following to the GPU specification:

-l nodes=1:ppn=20:gpus=1:default

-l nodes=1:ppn=20:gpus=1:exclusive

-l nodes=1:ppn=20:gpus=1:exclusive_process

Note that the prohibited mode is not available.
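
As an illustration, the sketch below assembles the interactive qsub command for Ruby with a chosen compute mode. It only prints the command rather than submitting it; the variable names are illustrative, and the walltime is the one used in the interactive examples above.

```shell
# Compute mode to request: default, exclusive, or exclusive_process.
mode="exclusive_process"

# Append the mode to the GPU specification of the node request.
resources="nodes=1:ppn=20:gpus=1:${mode}"

echo "qsub -I -l ${resources} -l walltime=00:20:00"
# prints: qsub -I -l nodes=1:ppn=20:gpus=1:exclusive_process -l walltime=00:20:00
```

The same resource string can be placed on the #PBS -l nodes line of a batch script.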

Further Reading

Online documentation is available at http://developer.nvidia.com/nvidia-gpu-computing-documentation
