Our HOWTO collection contains short tutorials that step you through common (but potentially confusing) tasks that do not quite rise to the level of requiring more structured training materials. Items here may explain a procedure to follow, or present a "best practices" formula that we think may be helpful.
In some cases, you may need to use the MATLAB Parallel Computing Toolbox to connect remotely to OSC resources, whether to run parallel jobs in MATLAB or to use toolboxes for which you own your own licenses. This guide explains the basics of configuring your MATLAB installation to connect remotely to Oakley using MATLAB R2013a.
The first step is to download the necessary configuration files. Click the link below to download the files. Be sure to save the files in a convenient location that you can easily remember.
When you open MATLAB, click the "Parallel" dropdown menu from the "Environment" menu and select "Manage Cluster Profiles". At this time, a new window should open displaying the Cluster Profile Manager.
In the Cluster Profile Manager window, click the "Import" button and locate the directory containing your configuration files using the file browser.
Select the file: "genericNonSharedOakleyIntel.settings" and click "Open".
Then, in the Cluster Profile Manager window, you will need to modify some of the properties of the cluster profile that you just imported. Select "genericNonSharedOakleyIntel" from the list of cluster profiles, and click the "Edit" button in the lower right-hand corner of the window to enable editing.
In the editing window under "Submit Functions", you should see two entries -- IndependentSubmitFcn and CommunicatingSubmitFcn. In these entries, you will need to change the directory path provided to a directory of your choice within your home directory on OSC systems. This will be the destination for log files and intermediate data created as a result of submitting a job using the Parallel Computing Toolbox. These locations are not intended to be the destination for your results. Once these have been changed, click "Done" and close the Cluster Profile Manager window.
In the directory of configuration files, the file called "testremote.m" is the entry point for job submission using the Parallel Computing Toolbox. In this file, you will need to modify the "batch" command in order to run your particular MATLAB program. How this command is modified depends largely on whether you want to run a serial or parallel job. At the very least, you will need to provide a function or script name to be executed. For more information about the "batch" command and its various forms and arguments, see the Mathworks documentation for "batch".
Your results will not automatically be offloaded from the cluster when your job completes. In order to obtain the results of your calculations, you will need to save the relevant variables from your workspace in a .mat file using the "save" command in MATLAB. For more information about the "save" command and its various forms and arguments, see the Mathworks documentation for "save".
While we provide a number of Python modules, you may need one we do not provide. If it is a commonly used module, or one that is particularly difficult to compile, you can contact OSC Help for assistance, but we have provided an example below showing how to build and install your own Python modules and make them available inside of Python. Note that these instructions use "bash" shell syntax; this is our default shell, but if you are using something else (csh, tcsh, etc.), some of the syntax may be different.
First, you need to collect what you need in order to do the installation. To keep things tidy, we will do all of our work in ~/local/src. You should make this directory now.
mkdir -p ~/local/src
Now, we will need to download the source code for the module we want to install. In our example, we will use "NumExpr", a module we already provide in the system version of Python. You can either download the file to your desktop and then upload it to OSC, or download it directly using the wget utility (if you know the URL for the file).
cd ~/local/src
wget http://numexpr.googlecode.com/files/numexpr-2.0.1.tar.gz
Now, extract the downloaded file. In this case, since it's a "tar.gz" format, we can use tar to decompress and extract the contents.
tar xvfz numexpr-2.0.1.tar.gz
You can delete the downloaded archive now, if you wish, or leave it around should you want to start the installation from scratch.
To build the module, we will want to first create a temporary environment variable to aid in installation. We'll call it "INSTALL_DIR".
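For example, in bash (the exact path here is an assumption, following the appname/version layout described below; adjust it to your own plan):

```shell
# Assumed prefix using the appname/version convention (adjust as needed)
export INSTALL_DIR=$HOME/local/numexpr/2.0.1
```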
I am following, roughly, the convention we use at the system level. This allows us to easily install new versions of software without risking breaking anything that uses older versions. As you can see, I have specified a folder for the program (numexpr) and for the version (2.0.1). Now, to be consistent with Python installations, we're going to create a second temporary environment variable, which will contain the actual installation location.
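A sketch of that second variable, assuming the system Python is version 2.7 (matching the site-packages path shown below):

```shell
# Assumes INSTALL_DIR was set as above
INSTALL_DIR=$HOME/local/numexpr/2.0.1
# Python places libraries under lib/pythonX.Y/site-packages beneath the prefix
export TREE=$INSTALL_DIR/lib/python2.7/site-packages
```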
Now, make the directory tree.
mkdir -p $TREE
To compile the module, we should switch to the GNU compilers. The system installation of Python was compiled with the GNU compilers, and this will help avoid any unnecessary complications. We will also load the Python module, if it hasn't already been loaded.
module swap intel gnu
module load python
Now, build it. This step may vary a bit, depending on the module you are compiling. You can execute "python setup.py --help" to see what options are available. Since we are overriding the install path to one that we can write to, and that fits our management plan, we need to use the "--prefix" option:

python setup.py install --prefix=$INSTALL_DIR
At this point, the module is compiled and installed in ~/local/numexpr/2.0.1/lib/python2.7/site-packages. Occasionally, some files will be installed in ~/local/numexpr/2.0.1/bin as well. To ensure Python can locate these files, we need to modify our environment.
The most immediate way - but the one that must be repeated every time you wish to use the module - is to manually modify your environment. If files are installed in the "bin" directory, you'll need to add it to your path. As before, these examples are for bash, and may have to be modified for other shells. Also, you will have to modify the directories to match your install location.
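For example, using the numexpr install location from above (an assumption; substitute your own prefix):

```shell
# Put the locally installed executables ahead of the system ones
export PATH=$HOME/local/numexpr/2.0.1/bin:$PATH
```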
And, for the python libraries:
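A sketch, again assuming the example install prefix:

```shell
# Tell Python where to find the locally installed module
export PYTHONPATH=$HOME/local/numexpr/2.0.1/lib/python2.7/site-packages:$PYTHONPATH
```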
We don't really recommend this option, as it is less flexible and can cause conflicts with system software. But, if you want, you can modify your .bashrc (or similar file, depending on your shell) to set these environment variables automatically, by copying the lines above that modify $PATH and $PYTHONPATH into .bashrc. Be extra careful; a mistake in .bashrc (or similar) can destroy your login environment in a way that will require a system administrator to fix. Remember - test the lines interactively first! If you break your shell interactively, the fix is as simple as logging out and logging back in. If you break your login environment, you'll have to get our help to fix it.
This is the most complicated option, but it is also the most flexible, as you can have multiple versions of this particular software installed and specify at run-time which one to use. This is incredibly useful if, for example, a major feature changes in a way that would break old code. You can see our tutorial on writing modules here, but the important variables to modify are, again, $PATH and $PYTHONPATH. You should specify the complete path to your home directory here, and not rely on shortcuts like ~:

prepend-path PYTHONPATH /nfs/10/guilfoos/local/oakley/numexpr/2.0.1/lib/python2.7/site-packages
prepend-path PATH /nfs/10/guilfoos/local/oakley/numexpr/2.0.1/bin
Once your module is created (again, see the guide), you can use your Python module simply by loading the software module you created.
module load numexpr/2.0.1
Sometimes the best way to get access to a piece of software on the HPC systems is to install it yourself as a "local install". This document will walk you through the OSC-recommended procedure for maintaining local installs in your home directory or project space.
Before installing your software, you should first prepare a place for it to live. We recommend the following directory structure, which you should create in the top-level of your home directory:
local
|-- src
|   `-- tars
`-- share
    `-- modulefiles
This structure is quite common in the UNIX world, and is also how OSC organizes the software we provide. Each directory serves a specific purpose:
local - Gathers all the files related to your local installs into one directory, rather than cluttering your home directory. Applications will be installed into this directory with the format "appname/version". This allows you to easily store multiple versions of a particular software install if necessary.
local/src - Stores the installers -- generally source directories -- for your software.
local/src/tars - Stores the compressed archives ("tarballs") of your installers. Useful if you want to reinstall later using different build options.
local/share/modulefiles - The standard place to store modulefiles, which will allow you to dynamically add or remove locally installed applications from your environment.
You can create this structure with one command. After navigating to where you want to create the directory structure, run:
mkdir -p local/src/tars local/share/modulefiles
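If you want to double-check the result, find will list the directories created (the mktemp line is only there to keep this sketch self-contained; from your home directory, the mkdir and find lines alone suffice):

```shell
cd "$(mktemp -d)"
mkdir -p local/src/tars local/share/modulefiles
find local -type d | sort
# local
# local/share
# local/share/modulefiles
# local/src
# local/src/tars
```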
Finally, you need to add your local modulefiles directory to the module system's search path. To do this, append the following line to your .bashrc, .tcshrc, or other shell startup script:

module use /nfs/01/username/local/share/modulefiles

Replace "/nfs/01/username" with the full path of your home directory. (You can identify this from the command line with the command "echo $HOME".) If you already have a .modulerc file, just add the "module use" line to the end of it.
Now that you have your directory structure in place, you can install your software. For demonstration purposes, we will install a local copy of the Git version control system.
First, we need to get the source code onto the HPC filesystem. The easiest thing to do is find a download link, copy it, and use the wget tool to download it on the HPC. We'll download this into ~/local/src/tars:

cd ~/local/src/tars
wget http://git-core.googlecode.com/files/git-1.7.x.tar.gz

(The version number here is a placeholder; substitute the version you are actually downloading.)
Now extract it into the src directory above. If you're working with a tar file, you can use the -C option to specify the directory to extract to:

tar zxvf git-1.7.x.tar.gz -C ../
Next, we'll go into the source directory and build the program. Consult your application's documentation to determine how to specify the installation location; you should install into $HOME/local/app/version, replacing app with the application's name and version with the version you are installing, as demonstrated below. In this case, we'll use the --prefix option to specify the install location.
You'll also want to specify a few variables to help make your application more compatible with our systems. We recommend specifying that you wish to use the Intel compilers and that you want to link the Intel libraries statically. This will prevent you from having to have the Intel module loaded in order to use your program. To accomplish this, add "CC=icc CFLAGS=-static-intel" to the end of your invocation of configure. If your application does not use configure, you can generally still set these variables somewhere in its Makefile or build script.
With these things in mind, we can build Git using the following commands:
cd ../git-1.7.x
./configure --prefix=$HOME/local/git/1.7.x CC=icc CFLAGS=-static-intel
make && make install
Your application should now be fully installed. However, before you can use it you will need to add the installation's directories to your path. To do this, you will need to create a module.
Modules allow you to dynamically alter your environment to define environment variables and bring executables, libraries, and other features into your shell's search paths. They are written in the Tcl language, though you do not need to be familiar with it to create a simple module.
All modules begin with the string "#%Module". After that, they contain several commands to tell the module system how to modify your environment. Some of the commonly used ones are:
prepend-path VARIABLE path - Adds path to the beginning of VARIABLE, which is a colon-separated list of paths. Generally used to modify search-path variables such as PATH.
setenv VARIABLE value - Sets the environment variable VARIABLE to value.
set VARIABLE value - Sets a local variable to be used within the module.
You can read about all of the available commands by reading the manpage for "modulefile":
A simple module for our Git installation would be:
#%Module

## Local variables
set name git
set version 1.7.x
set root /nfs/01/username/local/$name/$version

## Environment modifications

# Set basic paths
prepend-path PATH $root/bin
prepend-path MANPATH $root/share/man

# Git includes some Python and Perl modules that may be useful
prepend-path PERL5LIB $root/lib
prepend-path PERL5LIB $root/lib64
prepend-path PYTHONPATH $root/lib
Any modulefile you create should be saved into your local modulefiles directory. For maximum future-proofing, create a subdirectory within modulefiles named after your app and add one modulefile to that directory for each version of the app installed.
In the case of our Git example, you would create the directory $HOME/local/share/modulefiles/git and create a modulefile within that directory named "1.7.x" (matching the installed version). To make this module usable, you need to tell the modules utility where to look for it. You can do this by issuing the command "module use $HOME/local/share/modulefiles", in our example. This will allow you to load your app using either "module load git" or "module load git/1.7.x". If you installed version 1.8 later on and created a modulefile for it called "1.8", the module system would automatically load the newer version whenever you loaded git. If you needed to go back to the older version for some reason, you could do so by specifying the version you wanted: "module load git/1.7.x".
Note that "module use /path/to/modulefiles" is not persistent between sessions; to make it permanent, add the line to your shell startup script as described above.
For a starting point, copy our sample modulefile from
~support/doc/modules/sample_module. This modulefile follows the recommended design patterns laid out above, and includes samples of many common module operations.
For more information about modules, be sure to read the module(1) and modulefile(4) manpages. If you have any questions about modules or local installations, feel free to contact the OSC Help Desk at firstname.lastname@example.org.
Globus Connect is a reliable, high-performance file transfer platform allowing users to transfer large amounts of data seamlessly between systems. It aims to make transfers a "click-and-forget" process by setting up configuration details in the background and automating fault recovery.
Globus can be used for file transfers between OSC and:
Users transferring between OSC and another computing institution with Globus installed do not need to install Globus Connect Personal or add OSC certificate authority files, and can skip to Step 3.
More on how Globus works can be found on the Globus "How It Works" page.
To use Globus to transfer from a personal computer, you will need to:
Those transferring between OSC and another computing institution can skip to Step 3.
globusconnect, found within the unzipped directory
By default Globus will only add certain default folders to the list of files and directories accessible by Globus. To change/add/remove files and directories from this list:
This process will add Certificate Authority files to your computer to ensure you are a trusted endpoint. If you installed Globus on a Windows computer, and also installed it to the default directory, we have provided this as an automated process. Linux and Mac installations, as well as Windows installations to non-default locations, will need to follow the manual instructions below.
Copy was successful!
Copy was unsuccessful!
You will need to follow the manual instructions below.
/Applications/Globus Connect Personal.app/Contents/MacOS/
These steps will create a short-lived certificate on OSC's systems to ensure you are a valid OSC user. This certificate will be good for 11 hours, after which you will need to repeat this process to transfer to/from OSC again. Repeating this process will overwrite your old certificate, meaning you will need to change the credentials for osc#Glenn on the Globus website immediately after.
Adding InCommon authentication to your Globus account allows you to log in to Globus Online using your university credentials. With this process you can put your Globus username and password away for safekeeping, and instead use your university username and password to log in. If you're already logged in to your university authentication system, logging in to Globus can be as simple as two clicks.
To use this feature, your university needs to be an InCommon participant. Some Ohio universities active in InCommon include: Ohio State University, Case Western Reserve University, Columbus State Community College, Miami University, Ohio Northern University, Ohio University, University of Findlay, University of Dayton, and many more.
For a complete list, visit https://incommon.org/participants/ .
When you go to login next, click "alternative login" and then "InCommon / CILogon". Select your university on the next page, and login using your university credentials. Globus will remember this preference, and automatically prompt you to login using your university authentication next time.
SSHing directly to a compute node at OSC - even if that node has been assigned to you in a current batch job - and starting VNC is an "unsafe" thing to do. When your batch job ends (and the node is assigned to other users), stray processes will be left behind and negatively impact other users. However, it is possible to use VNC on compute nodes safely.
The examples below are for Oakley.
Step one is to create your VNC server inside a batch job.
The preferred method is to start an interactive job, requesting an entire node, and then once your job starts, you can start the VNC server.
qsub -I -l nodes=1:ppn=12:gpus=2:vis
This command requests an entire GPU node, and tells the batch system you wish to use the GPUs for visualization. This will ensure that the X11 server can access the GPU for acceleration. In this example, I have not specified a duration, which will then default to 1 hour.
module load virtualgl
module load turbovnc
Then start your VNC server by running "vncserver". (The first time you run this command, it may ask you for a password - this is to secure your VNC session from unauthorized connections. Set it to whatever password you desire. We recommend a strong password.)
The output of this command is important: it tells you where to point your client to access your desktop. Specifically, we need both the host name (before the :), and the screen (after the :).
New 'X' desktop is n0302.ten.osc.edu:1
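If you are scripting this, the host and screen can be pulled out of that line with plain shell parameter expansion (a sketch using the sample output above):

```shell
line="New 'X' desktop is n0302.ten.osc.edu:1"
hostport=${line##* }     # last word of the line: n0302.ten.osc.edu:1
screen=${hostport#*:}    # everything after the colon: 1
host=${hostport%%:*}     # everything before the colon: n0302.ten.osc.edu
echo "$host $screen"     # prints: n0302.ten.osc.edu 1
```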
This option is less convenient, because it is slightly more difficult to get the hostname and screen. However, by submitting a non-interactive batch job, you can go away and have the system email you when your desktop is ready to be connected to. More importantly, if your SSH connection to OSC is unstable or intermittent, you do not run the risk of being disconnected during an interactive session and having your VNC server terminated. In general, we recommend this option only when running via an interactive session is not feasible.
In order to start a VNC session non-interactively, you can submit the following script to the scheduler using qsub (adjusting the walltime to what you need):
#PBS -l nodes=1:ppn=12:gpus=2:vis
#PBS -l walltime=00:15:00
#PBS -m b
#PBS -N VNCjob
#PBS -j oe

module load virtualgl
module load turbovnc
vncserver
sleep 100
vncpid=`pgrep -s 0 Xvnc`
while [ -e /proc/$vncpid ]; do sleep 0.1; done
This script will send you an email when your job has started, which includes the hostname.
PBS Job Id: 935621.oak-batch.osc.edu
Job Name:   VNCjob
Exec host:  n0282/11+n0282/10+n0282/9+n0282/8+n0282/7+n0282/6+n0282/5+n0282/4+n0282/3+n0282/2+n0282/1+n0282/0
Begun execution
The screen is virtually always "1", unless someone else started a VNC server on that node outside of the batch system. You can verify the output of the vncserver command by using qpeek on a login node:
Where "jobid" is the batch system job number, for example, "935621".
Because the compute nodes of our clusters are not directly accessible, you must log in to one of the login nodes and allow your VNC client to "tunnel" through SSH to the compute node. The specific method of doing so may vary depending on your client software.
I will be providing the basic command line syntax, which works on Linux and MacOS. You would issue this in a new terminal window on your local machine, creating a new connection to Oakley.
ssh -L 5901:n0302.ten.osc.edu:5901 email@example.com
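The 5901 in that command is not arbitrary: VNC screen N listens on TCP port 5900 + N. A quick sketch of building the command for an arbitrary screen (the login address here is a placeholder; use your own OSC username and login host):

```shell
screen=1
host=n0302.ten.osc.edu            # compute node from the vncserver output
port=$((5900 + screen))           # VNC screen N listens on port 5900 + N
echo "ssh -L ${port}:${host}:${port} your_username@login_host"
```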
Open your VNC client, and connect to "localhost:1" - this will tunnel to the correct node on Oakley.
This example uses Chicken of the VNC, a MacOS VNC client.
The default window that comes up for Chicken requires the host to connect to, the screen (or port) number, and optionally allows you to specify a host to tunnel through via SSH. This screenshot shows a proper configuration for the output of vncserver shown above. Substitute your host, screen, and username as appropriate.
When you click [Connect], you will be prompted for your HPC password (to establish the tunnel, provided you did not input it into the "password" box on this dialog), and then (if you set one), for your VNC password. If your passwords are correct, the desktop will display in your client.