TensorFlow

"TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code."

Quote from: https://github.com/tensorflow/tensorflow

Availability and Restrictions

Versions

The following version of TensorFlow is available on OSC clusters:

VERSION                           OWENS
v0.10.0rc0 (using Python 2.7)     X

The current version of TensorFlow on Owens requires cuda/8.0.44 for GPU calculations. TensorFlow is a Python package, so you must first load the Python module with module load python/2.7. The installed version may change when Anaconda Python is updated on Owens; check the current version with conda list tensorflow

Feel free to contact OSC Help if you need other versions for your work.

Access 

TensorFlow is available to all OSC users without restriction.

Usage on Owens

Setup on Owens

The TensorFlow package is installed using Anaconda Python 2. To configure the Owens cluster for the use of TensorFlow, use the following command:

module load python/2.7 cuda/8.0.44
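Once the modules are loaded, a short Python check can confirm that TensorFlow is importable and report its version. This is a minimal sketch; the helper name package_version is hypothetical and not part of TensorFlow or the OSC environment.

```python
import importlib


def package_version(name):
    """Return a package's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    # Not every package defines __version__, so fall back to None.
    return getattr(module, "__version__", None)


version = package_version("tensorflow")
if version is None:
    print("TensorFlow is not importable; check that the modules are loaded.")
else:
    print("TensorFlow version: {}".format(version))
```

The reported version should match what conda list tensorflow shows for the same environment.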

Batch Usage on Owens

Batch jobs can request multiple nodes/cores and compute time up to the limits of the OSC systems. Refer to Queues and Reservations for Owens, and Scheduling Policies and Limits for more info.  In particular, TensorFlow should be run on a GPU-enabled compute node.
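To confirm that a job actually landed on a GPU-enabled node, a script can ask the NVIDIA driver which GPUs are visible before starting TensorFlow. The sketch below assumes the nvidia-smi utility is on the PATH of GPU nodes, which is typical where the NVIDIA driver is installed but is not stated on this page; the helper name list_gpus is hypothetical.

```python
import subprocess


def list_gpus():
    """Return one line per GPU reported by `nvidia-smi -L`, or [] if none are visible."""
    try:
        output = subprocess.check_output(["nvidia-smi", "-L"])
    except (OSError, subprocess.CalledProcessError):
        # nvidia-smi is missing or failed: treat this as "no GPU visible".
        return []
    text = output.decode("utf-8", "replace")
    return [line for line in text.splitlines() if line.strip()]


gpus = list_gpus()
print("GPUs visible on this node: {}".format(len(gpus)))
for gpu in gpus:
    print(gpu)
```

On a node without a GPU (for example, a login node) the list is expected to be empty, which is a cue to check the gpus request in the batch script.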

An Example of Using TensorFlow with the MNIST Dataset and Logistic Regression

Below is an example batch script (job.txt) and the Python script it runs (logistic_regression_on_mnist.py) for using TensorFlow.

Contents of job.txt

#PBS -N TensorFlow
#PBS -l nodes=1:ppn=28:gpus=1:default
#PBS -l walltime=30:00
#PBS -j oe
#PBS -S /bin/bash 

cd $PBS_O_WORKDIR
module load python/2.7 cuda/8.0.44
python logistic_regression_on_mnist.py

Contents of logistic_regression_on_mnist.py

# logistic_regression_on_mnist.py Python script based on:
# https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/0_Prerequisite/mnist_dataset_intro.ipynb
# https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/2_BasicModels/logistic_regression.ipynb

import tensorflow as tf

# Import MNIST
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("data/", one_hot=True)

# Parameters
learning_rate = 0.01
training_epochs = 25
batch_size = 100
display_step = 1

# tf Graph Input
x = tf.placeholder(tf.float32, [None, 784]) # mnist data image of shape 28*28=784
y = tf.placeholder(tf.float32, [None, 10]) # 0-9 digits recognition => 10 classes

# Set model weights
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))

# Construct model
pred = tf.nn.softmax(tf.matmul(x, W) + b) # Softmax

# Minimize error using cross entropy
cost = tf.reduce_mean(-tf.reduce_sum(y*tf.log(pred), reduction_indices=1))
# Gradient Descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

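# Initializing the variables
# (tf.global_variables_initializer() was added in TensorFlow 0.12; on older
# releases such as v0.10.0rc0, use tf.initialize_all_variables() instead)
init = tf.global_variables_initializer()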
# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Fit training using batch data
            _, c = sess.run([optimizer, cost], feed_dict={x: batch_xs,
                                                          y: batch_ys})
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if (epoch+1) % display_step == 0:
            print "Epoch:", '%04d' % (epoch+1), "cost=", "{:.9f}".format(avg_cost)

    print "Optimization Finished!"

    # Test model
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    # Calculate accuracy for 3000 examples
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print "Accuracy:", accuracy.eval({x: mnist.test.images[:3000], y: mnist.test.labels[:3000]})

To run it via the batch system, submit the job.txt file with the following command:

qsub job.txt
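When reading the job output, it helps to know what the cost and accuracy in logistic_regression_on_mnist.py measure. The following is a pure-Python sketch of the same softmax, cross-entropy, and accuracy arithmetic, written without TensorFlow; the helper names and the sample scores and labels are hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(pred, one_hot):
    """-sum(y * log(pred)) for one example; terms with y == 0 add nothing."""
    return -sum(y * math.log(p) for y, p in zip(one_hot, pred) if y > 0)


def argmax(values):
    """Index of the largest entry (the first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# With W and b all zeros, every logit is 0, so each of the 10 classes
# gets probability 1/10 and the cost is ln(10) regardless of the label.
pred = softmax([0.0] * 10)
label = [0.0] * 10
label[3] = 1.0  # hypothetical one-hot label for the digit 3
initial_cost = cross_entropy(pred, label)
print("initial cost: {:.6f}".format(initial_cost))

# Accuracy is the fraction of examples whose most probable class
# matches the class marked in the one-hot label.
predictions = [softmax([2.0, 0.5, 0.1]), softmax([0.2, 0.1, 3.0])]
labels = [[1, 0, 0], [0, 1, 0]]
matches = [argmax(p) == argmax(y) for p, y in zip(predictions, labels)]
accuracy = sum(matches) / float(len(matches))
print("accuracy: {}".format(accuracy))
```

Because the script initializes the weights to zero, a cost near ln(10) (about 2.3) at the very start of training is expected, and the per-epoch cost should fall from there.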

Further Reading

https://www.tensorflow.org/
