HOWTO: Use GPU with TensorFlow and PyTorch

GPU Usage on TensorFlow

Environment Setup

To begin, first create a new conda environment or use an existing one. See HOWTO: Create Python Environment for more details. In this example we are using python/3.6-conda5.2.

Once you have created and activated a conda environment, install tensorflow-gpu into it (in this example we will be using version 2.4.1 of tensorflow-gpu):

conda install tensorflow-gpu=2.4.1
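
To confirm the package and version that landed in the active environment, you can list it (an optional quick check):

conda list tensorflow-gpu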

 

Verify GPU accessibility (Optional):

Now that we have the environment set up, we can check whether TensorFlow can access the GPUs.

To test GPU access, we will submit the following job to a compute node with a GPU:

#!/bin/bash
#SBATCH --account <Project-Id>
#SBATCH --job-name Python_ExampleJob
#SBATCH --nodes=1
#SBATCH --time=00:10:00
#SBATCH --gpus-per-node=1


module load python/3.6-conda5.2 cuda/11.8.0

source activate tensorflow_env


# run either of the following commands

python << EOF 
import tensorflow as tf 
print(tf.test.is_built_with_cuda()) 
EOF

python << EOF
from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())
EOF


You will know TensorFlow can successfully access the GPU if tf.test.is_built_with_cuda() returns True and device_lib.list_local_devices() lists /device:GPU:0 as a device.
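
With TensorFlow 2.x you can also query the visible GPUs directly; a minimal alternative check, run inside the same job and environment as above:

python << EOF
import tensorflow as tf
# Lists the physical GPUs TensorFlow can see; an empty list means no GPU was found
print(tf.config.list_physical_devices('GPU'))
EOF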

At this point tensorflow-gpu should be set up to utilize a GPU for its computations.

 

GPU vs CPU

A GPU can provide significant performance improvements for many machine learning models. Here is an example Python script demonstrating the performance improvement. It is run in the same environment created in the section above.

from timeit import default_timer as timer
import tensorflow as tf
from tensorflow import keras
import numpy as np


(X_train, y_train), (X_test, y_test) = keras.datasets.cifar10.load_data()


# scaling image values between 0-1
X_train_scaled = X_train/255
X_test_scaled = X_test/255

# one hot encoding labels
y_train_encoded = keras.utils.to_categorical(y_train, num_classes = 10)
y_test_encoded = keras.utils.to_categorical(y_test, num_classes = 10)

def get_model():
    model = keras.Sequential([
        keras.layers.Flatten(input_shape=(32,32,3)),
        keras.layers.Dense(3000, activation='relu'),
        keras.layers.Dense(1000, activation='relu'),
        keras.layers.Dense(10, activation='sigmoid')    
    ])

    model.compile(optimizer='SGD',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
    return model

# GPU
with tf.device('/GPU:0'):
    start = timer()
    model_gpu = get_model()
    model_gpu.fit(X_train_scaled, y_train_encoded, epochs = 1)
    end = timer()


print("GPU time: ", end - start)

# CPU
with tf.device('/CPU:0'):
    start = timer()
    model_cpu = get_model()
    model_cpu.fit(X_train_scaled, y_train_encoded, epochs = 1)
    end = timer()

print("CPU time: ", end - start)

Example code sampled from here

The above code was then submitted as a job with the following script:

#!/bin/bash 
#SBATCH --account <Project-Id> 
#SBATCH --job-name Python_ExampleJob 
#SBATCH --nodes=1 
#SBATCH --time=00:10:00 
#SBATCH --gpus-per-node=1 

module load python/3.6-conda5.2 cuda/11.8.0

source activate tensorflow_env

python tensorflow_example.py

Make sure you request a GPU! For more information, see GPU Computing.

As we can see from the output, the GPU provided a significant performance improvement.

GPU time:  3.7491355929996644

CPU time:  78.8043485119997

 

Usage on Jupyter

If you would like to use a GPU for your TensorFlow project in a Jupyter notebook, follow the steps below to set up your environment.

To begin, first create a new conda environment or use an existing one. See HOWTO: Create Python Environment for more details. In this example we are using python/3.6-conda5.2.

Once you have created and activated a conda environment, install tensorflow-gpu into it (in this example we will be using version 2.4.1 of tensorflow-gpu):

conda install tensorflow-gpu=2.4.1

Now we will set up a Jupyter kernel. See HOWTO: Use a Conda/Virtual Environment With Jupyter for details on how to create a Jupyter kernel from your conda environment.
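
That HOWTO covers the supported steps; as a rough sketch, registering the activated environment as a kernel with ipykernel typically looks like the following (the display name here is just an example):

python -m ipykernel install --user --name tensorflow_env --display-name "Python (tensorflow_env)"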

Once you have created the kernel, see the Usage section of the Python page for more details on accessing the Jupyter app from OnDemand.

When configuring your notebook, make sure to select a GPU-enabled node and a CUDA version.


Now you are all set up to use a GPU with TensorFlow in a Jupyter notebook.
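
Inside the notebook you can confirm the GPU is visible and, optionally, enable memory growth so TensorFlow does not reserve all GPU memory up front; a minimal first-cell sketch using standard TensorFlow 2.x calls:

import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
print(gpus)  # should list at least one GPU device

for gpu in gpus:
    # Allocate GPU memory on demand instead of claiming it all at startup
    tf.config.experimental.set_memory_growth(gpu, True)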

 

GPU Usage on PyTorch

Environment Setup

To begin, first create a new conda environment or use an existing one. See HOWTO: Create Python Environment for more details. In this example we are using python/3.6-conda5.2.

Once you have created and activated a conda environment, install pytorch into it (in this example we will be using version 1.3.1 of pytorch):

conda install pytorch=1.3.1

 

Verify GPU accessibility (Optional):

Now that we have the environment set up, we can check whether PyTorch can access the GPUs.

To test GPU access, we will submit the following job to a compute node with a GPU:

#!/bin/bash
#SBATCH --account <Project-Id>
#SBATCH --job-name Python_ExampleJob
#SBATCH --nodes=1
#SBATCH --time=00:10:00
#SBATCH --gpus-per-node=1


ml python/3.6-conda5.2 cuda/11.8.0

source activate pytorch_env


python << EOF
import torch
print(torch.cuda.is_available())
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
EOF

You will know PyTorch can successfully access the GPU if torch.cuda.is_available() returns True and torch.device("cuda:0" if torch.cuda.is_available() else "cpu") returns cuda:0.

At this point PyTorch should be set up to utilize a GPU for its computations.
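
Once the device object is available, the usual PyTorch pattern is to move the model and its input tensors onto it; a minimal sketch (the layer sizes and data here are placeholders, not part of the job above):

import torch
import torch.nn as nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

model = nn.Linear(128, 10).to(device)     # move the model's parameters to the selected device
x = torch.randn(32, 128, device=device)   # create the input batch directly on the same device
y = model(x)                              # the forward pass now runs on the GPU (or CPU fallback)
print(y.device)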

 

GPU vs CPU

Here is an example PyTorch script demonstrating the performance improvement from a GPU:

import torch
from timeit import default_timer as timer


# check for cuda availability
print("Cuda: ", torch.cuda.is_available())
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print("Device: ", device)


#GPU 
b = torch.ones(4000,4000).cuda() # Create matrix on GPU memory
start_time = timer() 
for _ in range(1000): 
    b += b 
elapsed_time = timer() - start_time 

print('GPU time = ',elapsed_time)


#CPU
a = torch.ones(4000,4000) # Create matrix on CPU memory
start_time = timer()
for _ in range(1000):
    a += a
elapsed_time = timer() - start_time

print('CPU time = ',elapsed_time)


 

The above code was then submitted as a job with the following script:

#!/bin/bash 
#SBATCH --account <Project-Id> 
#SBATCH --job-name Python_ExampleJob 
#SBATCH --nodes=1 
#SBATCH --time=00:10:00 
#SBATCH --gpus-per-node=1 

ml python/3.6-conda5.2 cuda/11.8.0

source activate pytorch_env

python pytorch_example.py

Make sure you request a GPU! For more information, see GPU Computing.

As we can see from the output, the GPU provided a significant performance improvement.

GPU time =  0.0053490259997488465

CPU time =  4.232843188998231
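
One caveat: CUDA kernels launch asynchronously, so a timing loop like the GPU one above can stop the timer before the queued GPU work has actually finished. A variant that waits for the GPU before reading the timer gives a more faithful number (a sketch reusing the same pattern):

import torch
from timeit import default_timer as timer

if torch.cuda.is_available():
    b = torch.ones(4000, 4000).cuda()
    start_time = timer()
    for _ in range(1000):
        b += b
    torch.cuda.synchronize()  # block until all queued GPU work has completed
    elapsed_time = timer() - start_time
    print('GPU time (synchronized) = ', elapsed_time)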

 

Usage on Jupyter

If you would like to use a GPU for your PyTorch project in a Jupyter notebook, follow the steps below to set up your environment.

To begin, first create a new conda environment or use an existing one. See HOWTO: Create Python Environment for more details. In this example we are using python/3.6-conda5.2.

Once you have created and activated a conda environment, install pytorch into it (in this example we will be using version 1.3.1 of pytorch):

conda install pytorch=1.3.1

You may also need to install numba for PyTorch to access a GPU from the Jupyter notebook.

conda install numba=0.54.1

 

Now we will set up a Jupyter kernel. See HOWTO: Use a Conda/Virtual Environment With Jupyter for details on how to create a Jupyter kernel from your conda environment.

Once you have created the kernel, see the Usage section of the Python page for more details on accessing the Jupyter app from OnDemand.

When configuring your notebook, make sure to select a GPU-enabled node and a CUDA version.


Now you are all set up to use a GPU with PyTorch in a Jupyter notebook.
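
To confirm the notebook kernel can see the GPU, a quick first-cell check (standard PyTorch calls, no extra packages assumed):

import torch

print(torch.cuda.is_available())          # True if a GPU is visible to PyTorch
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))  # name of the first visible GPU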

Horovod

If you are using TensorFlow or PyTorch, you may also want to consider using Horovod. Horovod takes single-GPU training scripts and scales them to train across many GPUs in parallel.
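
As a rough sketch of what this looks like with the Keras API (assuming Horovod was installed into the environment with TensorFlow support; the tiny model here is a placeholder):

import tensorflow as tf
import horovod.tensorflow.keras as hvd

hvd.init()  # Horovod runs one process per GPU and gives each a rank

# Pin this process to a single local GPU
gpus = tf.config.list_physical_devices('GPU')
if gpus:
    tf.config.set_visible_devices(gpus[hvd.local_rank()], 'GPU')

model = tf.keras.Sequential([tf.keras.layers.Dense(10)])

# Scale the learning rate by the number of workers and wrap the optimizer
opt = hvd.DistributedOptimizer(tf.keras.optimizers.SGD(0.01 * hvd.size()))
model.compile(optimizer=opt, loss='mse')

# Broadcast initial weights from rank 0 so every worker starts identically
callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]
# model.fit(x_train, y_train, callbacks=callbacks)  # launched via horovodrun/mpirun, one process per GPU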

 
