Torch

"Torch is a deep learning framework with wide support for machine learning algorithms. It's open-source, simple to use, and efficient, thanks to an easy and fast scripting language, LuaJIT, and an underlying C / CUDA implementation. Torch offers popular neural network and optimization libraries that are easy to use, yet provide maximum flexibility to build complex neural network topologies. It also runs up to 70% faster on the latest NVIDIA Pascal™ GPUs, so you can now train networks in hours, instead of days."

Quote from: http://www.nvidia.com/object/torch-library.html

Availability and Restrictions

Versions

The following version of Torch is available on OSC clusters:

 

VERSION    OWENS
Torch7     X

 

The current version of Torch on Owens requires cuda/8.0.44 and cuDNN v5 for GPU calculations.

Feel free to contact OSC Help if you need other versions for your work.

Access 

Torch is available to all OSC users without restriction.

Usage

Usage on Owens

Setup on Owens

To set up your environment for Torch on Owens, load the module with the following command:

module load torch
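
After loading the module, you may want to confirm that the Torch launcher th (the command used in the batch script on this page) is on your PATH. The following is a minimal sketch rather than an official OSC check; the helper name check_cmd is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper (not part of Torch or the OSC modules):
# report whether a command is available on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found; try 'module load torch' first"
        return 1
    fi
}

# th is the Torch7 launcher; "|| true" keeps the script going if it is missing
check_cmd th || true
```

If th is reported as not found, the module was most likely not loaded in the current shell.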

Batch Usage on Owens

Batch jobs can request multiple nodes/cores and compute time up to the limits of the OSC systems. Refer to Queues and Reservations for Owens and Scheduling Policies and Limits for more information. In particular, Torch should be run on a GPU-enabled compute node.
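
Because Torch should run on a GPU-enabled node, it can help to have a job stop early when no GPU tooling is visible. The guard below is a sketch under assumptions: the function name require_gpu is hypothetical, and it assumes the NVIDIA driver utility nvidia-smi is installed on GPU nodes, which this page does not state.

```shell
#!/bin/bash
# Hypothetical guard for the top of a job script.
# Assumption: GPU nodes provide nvidia-smi on PATH.
require_gpu() {
    if ! command -v nvidia-smi >/dev/null 2>&1; then
        echo "nvidia-smi not found: this does not look like a GPU node" >&2
        return 1
    fi
    # -L lists the GPUs visible to this job
    nvidia-smi -L
}
```

In a batch script you could call it as "require_gpu || exit 1" before starting the training step, so that a job landing on a node without GPU tooling fails quickly instead of partway through.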

An Example of Using Torch with CIFAR10 Training Data on Owens

Below is an example batch script (job.txt) for using Torch. See https://github.com/szagoruyko/cifar.torch for more details.

#PBS -N Torch
#PBS -l nodes=1:ppn=28:gpus=1:default
#PBS -l walltime=30:00
#PBS -j oe
# Load the Torch module
module load torch
# Change to the job's temporary directory
cd $TMPDIR
# Clone sample data and scripts
git clone https://github.com/szagoruyko/cifar.torch.git .
# Run the image preprocessing (not necessary for subsequent runs, just re-use provider.t7)
OMP_NUM_THREADS=28 th -i provider.lua <<Input
provider = Provider()
provider:normalize()
torch.save('provider.t7',provider)
exit
y
Input
# Run the torch training
th train.lua --backend cudnn
# Copy results from job temp directory
cp -a * $PBS_O_WORKDIR

To run it via the batch system, submit the job.txt file with the following command:

qsub job.txt
 
Further Reading

http://torch.ch/
