HOWTO

Our HOWTO collection contains short tutorials that help you step through some of the common (but potentially confusing) tasks users may need to accomplish, that do not quite rise to the level of requiring more structured training materials. Items here may explain a procedure to follow, or present a "best practices" formula that we think may be helpful.


HOW TO: Look at requested time accuracy using XDMoD

The XDMoD tool at xdmod.osc.edu can be used to get an overview of how closely the requested time of jobs matches their elapsed time.

One way of specifying a time request is:

#SBATCH --time=xx:xx:xx

The elapsed time is how long the job ran before completing. It can be obtained with the sacct command:

$ sacct -u <username> --format=jobid,account,elapsed

It is important to understand that the requested time is used when scheduling a submitted job. If a job requests much more time than it is expected to run, it may take longer to start, because resources must be allocated for the full requested time even if the job uses only a small portion of it.
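As a rough illustration of what "accuracy" means for a single job, the sketch below converts Slurm-style durations to seconds and divides the elapsed time by the requested time. The function names and the sample values are hypothetical; the sketch assumes durations in the [D-]HH:MM:SS form that sacct prints for the elapsed field.

```python
def slurm_time_to_seconds(text):
    """Convert a duration such as '02:00:00' or '1-02:03:04' to seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(p) for p in text.split(":")]
    # Pad on the left so that 'MM:SS' is treated as 0 hours.
    while len(parts) < 3:
        parts.insert(0, 0)
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


def time_accuracy(elapsed, requested):
    """Return elapsed / requested as a fraction (1.0 means a perfect request)."""
    return slurm_time_to_seconds(elapsed) / slurm_time_to_seconds(requested)


if __name__ == "__main__":
    # Hypothetical job: asked for 2 hours, finished in 30 minutes.
    fraction = time_accuracy("00:30:00", "02:00:00")
    print(f"{fraction:.0%} of the requested time was used")
```

A value far below 100% suggests the request could be shortened; a value at 100% suggests the job may have run out of time.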

The sacct command shows the requested time accuracy for an individual job, but XDMoD can be used to do this for all jobs submitted over a time range.

First, log in to xdmod.osc.edu. See this page for more instructions:

https://www.osc.edu/supercomputing/knowledge-base/xdmod_tool

Then, navigate to the Metric Explorer tab.

Look for the Metric Catalog on the left side of the page and expand the SUPREMM options. Select Wall Hours: Requested: Per Job and group by None.

walltime_acc_metric_tab.png

This will now show the average time requested.

The actual time data can be added by navigating to Add Data -> SUPREMM -> Wall Hours: Per Job.

walltime_acc_add_data.png

walltime_acc_select_walltime.png

This will open a new window titled Data Series Definition, where some parameters can be changed before the new data is shown. To easily distinguish between elapsed and requested time, change the Display Type to Bar, then click add to view the new data.

walltime_add_data_settings.png

Now there is a line that shows the average requested time of jobs, and bars that depict the average elapsed time of jobs. Essentially, the closer the bar is to the line without intersecting it, the more accurate the time prediction. If the bar intersects the line, it may indicate that not enough time was requested for a job to complete, but remember that these values are averages.

walltime_acc_final_zoom.png
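Because the chart shows averages, a bar well below the line does not guarantee that every job finished in time. The sketch below uses made-up (requested, elapsed) values for five jobs to show how one job that reached its limit can hide behind a low average; the data and the summarize function are purely illustrative and are not produced by XDMoD.

```python
from statistics import mean

# Hypothetical (requested_hours, elapsed_hours) pairs for five jobs.
jobs = [(4.0, 0.5), (4.0, 0.6), (4.0, 4.0), (4.0, 0.4), (4.0, 0.5)]


def summarize(jobs):
    """Return the average requested hours, the average elapsed hours, and
    the number of jobs whose elapsed time reached the requested time."""
    avg_requested = mean(requested for requested, _ in jobs)
    avg_elapsed = mean(elapsed for _, elapsed in jobs)
    hit_limit = sum(1 for requested, elapsed in jobs if elapsed >= requested)
    return avg_requested, avg_elapsed, hit_limit


avg_requested, avg_elapsed, hit_limit = summarize(jobs)
print(f"average requested: {avg_requested:.1f} h")
print(f"average elapsed:   {avg_elapsed:.1f} h")
print(f"jobs that reached their limit: {hit_limit}")
```

This is why the raw data view described next is worth checking when the averages look comfortable.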

One can also view more detailed information about these jobs by clicking a data point and using the Show raw data option.

wall_acc_select_datapoint.png

In order to have the Show raw data option, one may need to use the Drilldown option first to sort the jobs in that list by use or another metric.

wall_acc_show_raw_data.png


HOWTO: Add Python packages using the conda package manager

While our Python installations come with many popular packages installed, you may come upon a case in which you need an additional package that is not installed. If the specific package you are looking for is available from anaconda.org (formerly binstar.org), you can easily install it and its required dependencies by using the conda package manager.

Procedure

The following steps are an example of how to set up a Python environment and install packages to a local directory using conda. We use the name local for the environment, but you may use any other name.

Load proper Python module

We have python and miniconda3 modules. The python modules are based on Anaconda, and the miniconda3 module is based on Miniconda. The python modules are typically recommended when you use Python in a standard environment that we provide. However, if you want to create your own Python environment, we recommend using the miniconda3 module, since you can start with a minimal configuration.

module load miniconda3

Create Python installation to local directory

Three alternative create commands are listed. These cover the most common cases.

Create New Environment

The following will create a minimal Python installation without any extraneous packages:

conda create -n local

Clone base environment

If you want to clone the full base Python environment from the system, you may use the following create command:

conda create -n local --clone base

Create New Environment with specific packages

You can augment the command above by listing specific packages you would like installed into the environment. For example, the following will create a minimal Python installation with only the specified packages (in this case, numpy and babel):

conda create -n local numpy babel

By default, conda will install the newest versions of the packages it can find. Specific versions can be specified by adding =<version> after the package name. For example, the following will create a Python installation with Python version 2.7 and NumPy version 1.16:

conda create -n local python=2.7 numpy=1.16

To verify that the environment has been created, use the command

conda info -e

For additional conda command documentation see https://docs.conda.io/projects/conda/en/latest/commands.html#conda-general-commands

Activate environment

Before the created environment can be used, it must be activated.

For the bash shell:

source activate local

At the end of the conda create step, you may have seen a message from the installer saying that you can use the conda activate command to activate the environment. Please do not use conda activate, because it will try to update your shell configuration file and may cause other issues. Use the source activate command as suggested above.

On newer versions of Anaconda on the Owens cluster you may also need to perform the removal of the following packages before trying to install your specific packages:

conda remove conda-build
conda remove conda-env

Install packages

To install additional packages, use the conda install command. For example, to install the yt package:

conda install yt

By default, conda will install the newest version of the package that it can find. Specific versions can be specified by adding =<version> after the package name. For example, to install version 1.16 of the NumPy package:

conda install numpy=1.16

If you need to install packages with pip, then you can install pip in your virtual environment by

conda install pip

Then, you can install packages with pip as

pip install PACKAGE

If there are errors on this step you will need to resolve them before continuing.

Test Python package

Now we will test our installed Python package by loading it in Python and checking its location to ensure we are using the correct version. For example, to test that NumPy is installed correctly, run

python -c "from __future__ import print_function; import numpy; print(numpy.__file__)"

and verify that the output generally matches

$HOME/.conda/envs/local/lib/python3.6/site-packages/numpy/__init__.py

To test installations of other packages, replace all instances of numpy with the name of the package you installed.
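The same check can be wrapped in a small script that also reports whether the package lives inside the active environment. This is a minimal sketch, not an OSC-provided tool: the function names are hypothetical, and it assumes that sys.prefix points at the root of the active environment, which is the normal behavior for conda environments and virtual environments.

```python
import importlib.util
import os
import sys


def package_location(name):
    """Return the file a package would be imported from, or None if it is
    not installed (or has no single file). The package is not imported."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def is_under(path, prefix):
    """Return True if path lies inside the directory prefix."""
    path = os.path.realpath(path)
    prefix = os.path.realpath(prefix)
    return os.path.commonpath([path, prefix]) == prefix


if __name__ == "__main__":
    name = sys.argv[1] if len(sys.argv) > 1 else "numpy"
    location = package_location(name)
    if location is None:
        print(f"{name} was not found for this interpreter")
    else:
        # sys.prefix is the root directory of the active environment.
        print(f"{name}: {location}")
        print("inside active environment:", is_under(location, sys.prefix))
```

Saved under a name of your choice, the script takes the package name as its only argument and defaults to numpy.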

Remember, you will need to load the proper version of Python before you go to use your newly installed package. Packages are only installed to one version of Python.

Install your own Python packages

If the method using conda above is not working, or if you prefer, you can consider installing Python packages from the source. Please read HOWTO: install your own Python packages.

But I use virtualenv and/or pip!

See the comparison to these package management tools here:

https://docs.conda.io/projects/conda/en/latest/commands.html#conda-vs-pip-vs-virtualenv-commands

Use pip only without conda package manager

pip installations are supported:

module load python
module list                            # check which python you just loaded
pip install --user --upgrade PACKAGE   # where PACKAGE is a valid package name

Note the default installation prefix is set to the system path where OSC users cannot install the package. With the option --user, the prefix is set to $HOME/.local where lib, bin, and other top-level folders for the installed packages are placed. Finally, the option --upgrade will upgrade the existing packages to the newest available version.

The one issue with this approach is portability with multiple Python modules. If you plan to stick with a single Python module, then this should not be an issue. However, if you commonly switch between different Python versions, then be aware of the potential trouble in using the same installation location for all Python versions.
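To see which directories a given Python module would use for --user installations, the standard-library site module can be queried. The sketch below is illustrative (the function name is made up); running it once per Python module you load shows whether two modules would write into the same location. On Linux the library directory normally includes the Python major.minor version, while the bin directory under the user base is shared.

```python
import site
import sys


def user_install_locations():
    """Return where `pip install --user` places files for this interpreter."""
    return {
        "python_version": f"{sys.version_info.major}.{sys.version_info.minor}",
        # Top-level directory for user installs, e.g. $HOME/.local
        "user_base": site.getuserbase(),
        # Library directory for this interpreter under user_base
        "user_site": site.getusersitepackages(),
    }


if __name__ == "__main__":
    for key, value in user_install_locations().items():
        print(f"{key}: {value}")
```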

Use pip in a Python virtual environment (Python 3 only)

Typically, you can install packages with the methods shown in the Install packages section above, but in cases where a package is not available from conda channels or has dependency issues, you may consider using pip in an isolated Python virtual environment.

To create an isolated virtual environment:

module reset
python3 -m venv --without-pip $HOME/venv/mytest --prompt "local"
source $HOME/venv/mytest/bin/activate
(local) curl https://bootstrap.pypa.io/get-pip.py |python     # get the newest version of pip
(local) deactivate

where we use the path $HOME/venv/mytest and the name local for the environment, but you may use any other path and name. 

To activate and deactivate the virtual environment:

source $HOME/venv/mytest/bin/activate
(local) deactivate 

To install packages:

source $HOME/venv/mytest/bin/activate
(local) pip install PACKAGE 

You don't need the --user option within the virtual environment.  
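For scripted setups, the environment can also be created from Python itself with the standard-library venv module. The sketch below is an assumption-laden illustration rather than a replacement for the commands above: the helper names are hypothetical, and a throwaway directory stands in for $HOME/venv/mytest. It also shows a common way to check whether the running interpreter belongs to a virtual environment.

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir, prompt="local"):
    """Create a virtual environment without pip, similar to
    `python3 -m venv --without-pip ENV_DIR --prompt local`."""
    venv.create(env_dir, with_pip=False, prompt=prompt)
    return Path(env_dir)


def in_virtual_environment():
    """Return True when the running interpreter belongs to a venv."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    # A scratch directory is used here; the HOWTO uses $HOME/venv/mytest.
    with tempfile.TemporaryDirectory() as scratch:
        env = create_environment(Path(scratch) / "mytest")
        print("created:", (env / "pyvenv.cfg").exists())
    print("currently inside a venv:", in_virtual_environment())
```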

Further Reading

Conda Test Drive: https://conda.io/docs/test-drive.html 


HOWTO: Install Tensorflow locally

This documentation describes how to install the tensorflow package locally in your $HOME space.

Load python module

module load python/3.6-conda5.2

Clone python installation to local directory

Three alternative create commands are listed.  These cover the most common cases:

conda create -n local --clone="$PYTHON_HOME"

This will clone the entire Python installation to the ~/.conda/envs/local directory. The process will take several minutes.

conda create -n local

This will create a local python installation without any packages. If you need a small number of packages, you may choose this option.

conda create -n local python={version} anaconda

If you would like to install a specific version of Python, you can specify it with the "python" option. For example, you can use "python=3.6" for version 3.6.

To verify that a clone has been created, use the command

conda info -e

For additional conda command documentation see https://conda.io/docs/commands.html

Activate clone environment

For the bash shell:

source activate local

On newer versions of Anaconda on the Owens cluster you may also need to perform the removal of the following packages before trying to install your specific packages:

conda remove conda-build
conda remove conda-env

Install package

Install the latest version of tensorflow that is gpu compatible.

pip install tensorflow-gpu

If there are errors on this step you will need to resolve them before continuing.

Test python package

Now we will test tensorflow package by loading it in python and checking its location to ensure we are using the correct version.

python -c "import tensorflow;print (tensorflow.__file__)"

Example output (the python2.7 component will match the Python version of your environment):

$HOME/.conda/envs/local/lib/python2.7/site-packages/tensorflow/__init__.py

Remember, you will need to load the proper version of Python before you use your newly installed package. Packages are only installed to one version of Python.

Install your own python modules

If the method using conda above is not working or if you prefer, you can consider installing python modules from the source. Please read HOWTO: install your own python modules.

 


HOWTO: Install your own Python packages

While we provide a number of Python packages, you may need a package we do not provide. If it is a commonly used package or one that is particularly difficult to compile, you can contact OSC Help for assistance. We also have provided an example below showing how to build and install your own Python packages and make them available inside of Python. These instructions use "bash" shell syntax, which is our default shell. If you are using something else (csh, tcsh, etc), some of the syntax may be different.

Please consider using the conda package manager before you try to build Python packages using the method explained here. We have instructions on conda here.

Gather your materials

First, you need to collect what you need in order to perform the installation. We will do all of our work in $HOME/local/src. You should make this directory now.

mkdir -p $HOME/local/src

Next, we will need to download the source code for the package we want to install. In our example, we will use "NumExpr," a package we already provide in the system version of Python. You can either download the file to your desktop and then upload it to OSC, or directly download it using the wget utility (if you know the URL for the file).

cd ~/local/src
wget http://numexpr.googlecode.com/files/numexpr-2.0.1.tar.gz

Next, extract the downloaded file. In this case, since it's a "tar.gz" format, we can use tar to decompress and extract the contents.

tar xvfz numexpr-2.0.1.tar.gz

You can delete the downloaded archive now or keep it should you want to start the installation from scratch.

Build it!

Environment

To build the package, we will want to first create a temporary environment variable to aid in installation. We'll call it INSTALL_DIR.

export INSTALL_DIR=${HOME}/local/numexpr/2.0.1

We are roughly following the convention we use at the system level. This allows us to easily install new versions of software without risking breaking anything that uses older versions. We have specified a folder for the program (numexpr), and for the version (2.0.1). To be consistent with Python installations, we will create a second temporary environment variable that will contain the actual installation location.

export TREE=${INSTALL_DIR}/lib/python2.7/site-packages

Next, make the directory tree.

mkdir -p $TREE
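The python2.7 component of $TREE has to match the Python version you build with. If you are unsure which version a loaded module provides, a short script can derive the directory for you. This is only a sketch: the function name is hypothetical, and it assumes the conventional lib/pythonX.Y/site-packages layout used on Linux.

```python
import os
import sys


def site_packages_dir(install_dir):
    """Return INSTALL_DIR/lib/pythonX.Y/site-packages for this interpreter."""
    version = f"python{sys.version_info.major}.{sys.version_info.minor}"
    return os.path.join(install_dir, "lib", version, "site-packages")


if __name__ == "__main__":
    # Mirrors the INSTALL_DIR used in this HOWTO.
    install_dir = os.path.join(os.path.expanduser("~"), "local", "numexpr", "2.0.1")
    print(site_packages_dir(install_dir))
```

Run it with the same python that will run setup.py, and use the printed path as the value of TREE.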

Compile

To compile the package, we should switch to the GNU compilers. The system installation of Python was compiled with the GNU compilers, and this will help avoid any unnecessary complications. We will also load the Python package, if it hasn't already been loaded.

module swap intel gnu
module load python

Next, build it. This step may vary a bit, depending on the package you are compiling. You can execute python setup.py --help to see what options are available. Since we are overriding the install path to one that we can write to and that fits our management plan, we need to use the --prefix option.

python setup.py install --prefix=$INSTALL_DIR

Make it usable

At this point, the package is compiled and installed in ~/local/numexpr/2.0.1/lib/python2.7/site-packages. Occasionally, some files will be installed in ~/local/numexpr/2.0.1/bin as well. To ensure Python can locate these files, we need to modify our environment.

Manual

The most immediate way -- but the one that must be repeated every time you wish to use the package -- is to manually modify your environment. If files are installed in the "bin" directory, you'll need to add it to your path. As before, these examples are for bash, and may have to be modified for other shells. Also, you will have to modify the directories to match your install location.

export PATH=$PATH:~/local/numexpr/2.0.1/bin

And for the Python libraries:

export PYTHONPATH=$PYTHONPATH:~/local/numexpr/2.0.1/lib/python2.7/site-packages
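To confirm that a PYTHONPATH entry actually reaches Python's module search path, you can start a child interpreter with the variable set and inspect its sys.path. The sketch below does that; the function name is hypothetical and the directory is the one from the example above, so adjust it to your install location.

```python
import os
import subprocess
import sys


def sys_path_with(pythonpath):
    """Start a child interpreter with PYTHONPATH set and return its sys.path."""
    env = dict(os.environ, PYTHONPATH=pythonpath)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    extra = os.path.expanduser("~/local/numexpr/2.0.1/lib/python2.7/site-packages")
    for entry in sys_path_with(extra):
        marker = "  <-- from PYTHONPATH" if entry == extra else ""
        print(entry + marker)
```

Entries from PYTHONPATH appear ahead of the interpreter's own site-packages directory, which is why a package installed this way can shadow a system-provided one.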

Hardcode it

We don't recommend this option, as it is less flexible and can cause conflicts with system software. But if you want, you can modify your .bashrc (or similar file, depending on your shell) to set these environment variables automatically. Be extra careful; making a mistake in .bashrc (or similar) can destroy your login environment in a way that will require a system administrator to fix. To do this, you can copy the lines above modifying $PATH and $PYTHONPATH into .bashrc. Remember to test them interactively first. If you destroy your shell interactively, the fix is as simple as logging out and then logging back in. If you break your login environment, you'll have to get our help to fix it.

Make a module (recommended!)

This is the most complicated option, but it is also the most flexible, as you can have multiple versions of this particular software installed and specify at run-time which one to use. This is incredibly useful if a major feature changes that would break old code, for example. You can see our tutorial on writing modules here, but the important variables to modify are, again, $PATH and $PYTHONPATH. You should specify the complete path to your home directory here and not rely on any shortcuts like ~ or $HOME.  Below is a modulefile written in Lua:

If you are following the tutorial on writing modules, you will want to place this file in $HOME/local/share/modulefiles/numexpr/2.0.1.lua:

-- This is a Lua modulefile; this file, 2.0.1.lua, can be located anywhere.
-- If you are following a local modulefile location convention, we place them in
-- $HOME/local/share/modulefiles/
-- For numexpr we place it in $HOME/local/share/modulefiles/numexpr/2.0.1.lua
-- This finds your home directory
local homedir = os.getenv("HOME")
-- Make the installed package importable from Python
prepend_path("PYTHONPATH",
pathJoin(homedir, "local/numexpr/2.0.1/lib/python2.7/site-packages"))
-- Make any installed executables available on the command line
prepend_path("PATH", pathJoin(homedir, "local/numexpr/2.0.1/bin"))

Once your module is created (again, see the guide), you can use your Python package simply by loading the software module you created.

module use $HOME/local/share/modulefiles/
module load numexpr/2.0.1

HOWTO: Collect performance data for your program

This page outlines ways to generate and view performance data for your program using tools available at OSC.

Intel Tools

This section describes how to use performance tools from Intel. Make sure that you have an Intel module loaded to use these tools.

Intel VTune

Intel VTune is a tool to generate profile data for your application. Generating profile data with Intel VTune typically involves three steps:

1. Prepare the executable for profiling.

You need executables with debugging information to view source code line detail: re-compile your code with a -g option added among the other appropriate compiler options. For example:

mpicc wave.c -o wave -g -O3

2. Run your code to produce the profile data.

Profiles are normally generated in a batch job. To generate a VTune profile for an MPI program:

mpiexec <mpi args> amplxe-cl <vtune args> <program> <program args>

where <mpi args> represents arguments to be passed to mpiexec, <program> is the executable to be run, <vtune args> represents arguments to be passed to the VTune executable amplxe-cl, and <program args> represents arguments passed to your program.

For example, if you normally run your program with mpiexec -n 12 wave_c, you would use

mpiexec -n 12 amplxe-cl -collect hotspots -result-dir r001hs wave_c

To profile a non-MPI program:

amplxe-cl <vtune args> <program> <program args>

As a result of this step, a subdirectory that contains the profile data files is created in your current directory. The subdirectory name is based on the -result-dir argument and the node id, for example, r001hs.o0674.ten.osc.edu.
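If you generate profiling commands from a script (for example, to write several batch jobs), the argument order described in this step can be captured in a small helper. This is an illustrative sketch only: the function is hypothetical and uses just the command names and options that appear in the examples above.

```python
import shlex


def vtune_command(program, program_args=(), mpi_args=(), vtune_args=()):
    """Assemble the VTune command line as a list of arguments.
    With no mpi_args, the non-MPI form is produced."""
    command = ["amplxe-cl", *vtune_args, program, *program_args]
    if mpi_args:
        # mpiexec and its arguments come first; VTune wraps the program.
        command = ["mpiexec", *mpi_args, *command]
    return command


if __name__ == "__main__":
    command = vtune_command(
        "wave_c",
        mpi_args=["-n", "12"],
        vtune_args=["-collect", "hotspots", "-result-dir", "r001hs"],
    )
    print(shlex.join(command))
```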

3. Analyze your profile data.

You can open the profile data using the VTune GUI in interactive mode. For example:

amplxe-gui r001hs.o0674.ten.osc.edu

One should use an OnDemand VDI (Virtual Desktop Interface) or have X11 forwarding enabled (see Setting up X Windows). Note that X11 forwarding can be distractingly slow for interactive applications.

Intel ITAC

Intel Trace Analyzer and Collector (ITAC) is a tool to generate trace data for your application. Generating trace data with Intel ITAC typically involves three steps:

1. Prepare the executable for tracing.

You need to compile your executable with the -tcollect option added among the other appropriate compiler options to insert instrumentation probes calling the ITAC API. For example:

mpicc wave.c -o wave -tcollect -O3

2. Run your code to produce the trace data.

mpiexec -trace <mpi args> <program> <program args>

For example, if you normally run your program with mpiexec -n 12 wave_c, you would use

mpiexec -trace -n 12 wave_c

As a result of this step, .anc, .f, .msg, .dcl, .stf, and .proc files will be generated in your current directory.

3. Analyze the trace data files using Trace Analyzer

You will need to use traceanalyzer to view the trace data. To open Trace Analyzer:

traceanalyzer /path/to/<stf file>

where the base name of the .stf file will be the name of your executable.

One should use an OnDemand VDI (Virtual Desktop Interface) or have X11 forwarding enabled (see Setting up X Windows) to view the trace data. Note that X11 forwarding can be distractingly slow for interactive applications.

Intel APS

Intel's Application Performance Snapshot (APS) is a tool that provides a summary of your application's performance. Profiling HPC software with Intel APS typically involves four steps:

1. Prepare the executable for profiling.

Regular executables can be profiled with Intel APS, but source code line detail will not be available. You need executables with debugging information to view source code line detail: re-compile your code with a -g option added among the other appropriate compiler options. For example:

mpicc wave.c -o wave -g -O3

2. Run your code to produce the profile data directory.

Profiles are normally generated in a batch job. To generate profile data for an MPI program:

mpiexec <mpi args> aps <program> <program args>

where <mpi args> represents arguments to be passed to mpiexec, <program> is the executable to be run, and <program args> represents arguments passed to your program.

For example, if you normally run your program with mpiexec -n 12 wave_c, you would use

mpiexec -n 12 aps wave_c

To profile a non-MPI program:

aps <program> <program args>

The profile data is saved in a subdirectory in your current directory. The directory name is based on the date and time, for example, aps_result_YYYYMMDD/

3. Generate the profile file from the directory.

To generate the html profile file from the result subdirectory:

aps --report=./aps_result_YYYYMMDD

to create the file aps_report_YYYYMMDD_HHMMSS.html.
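When a directory accumulates several result subdirectories, it can help to locate the most recent one before generating the report. The sketch below is a hypothetical helper that assumes the result directories follow the aps_result_ naming shown above; it prints the report command rather than running it.

```python
from pathlib import Path


def newest_aps_result(directory="."):
    """Return the most recently modified aps_result_* subdirectory, or None."""
    candidates = [p for p in Path(directory).glob("aps_result_*") if p.is_dir()]
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)


if __name__ == "__main__":
    result = newest_aps_result()
    if result is None:
        print("no aps_result_* directory found here")
    else:
        # Print the report command from step 3 for that directory.
        print(f"aps --report=./{result.name}")
```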

4. Analyze the profile data file.

You can open the profile data file using a web browser on your local desktop computer. This option typically offers the best performance.

ARM Tools

This section describes how to use performance tools from ARM.

ARM MAP

Instructions for how to use MAP are available here.

ARM DDT

Instructions for how to use DDT are available here.

ARM Performance Reports

Instructions for how to use Performance Reports are available here.

Other Tools

This section describes how to use other performance tools.

HPC Toolkit

Rice University's HPC Toolkit is a collection of performance tools. Instructions for how to use it at OSC are available here.

TAU Commander

TAU Commander is a user interface for University of Oregon's TAU Performance System. Instructions for how to use it at OSC are available here.


HOWTO: Connect to OSC services using OSC Connect

 


NOTE: OSC OnDemand now provides in-browser connectivity via VNC. OSC Connect is no longer required nor recommended. We recommend contacting OSC Help if you need assistance using the in-browser capabilities or if there is other functionality that is forcing you to still use OSC Connect. This page is still available primarily for reference.

To connect to OSC services, a secure tunnel to a session is required. This can be done relatively simply in OSX and Linux by using the SSH functionality built into the system, but Windows users have had to configure and use third party applications like PuTTY or Java to access secure resources at OSC. OSC Connect is a native Windows application written in C# and compiled for .NET 2.0, providing preconfigured management of secure tunnel connections for Windows users, as well as providing a launcher for secure file transfer, VNC, terminal, and web based services.

OSC Connect is supported on Windows versions from Windows XP through Windows 10.


Getting Started

  1. Download the latest release of OSC Connect from here: Latest Build
    • Click OSCConnect.exe  and download. Use "Save link as" to download this file to a folder of your choice. 
    • See here for all the previous releases. 
  2. Double-click the OSCConnect.exe icon to run the application. In the current state, OSC Connect is entirely deployed by a single executable file; no further installation is required.
  3. When you run OSC Connect, a temporary folder named "ConnectFiles" with four additional files will be created. These are required for proper functionality of the application. Please ensure that these files are permitted by your IT administrator:
    • plink.exe is the command-line version of PuTTY used by the application to create the secure connection to OSC resources.
    • putty.exe  is the GUI application of PuTTY used to provide terminal emulation remote console connections to OSC resources.
    • vncviewer.exe is the VNC viewer client used to view a remote desktop session.
    • WinSCP.exe  is the SFTP client used for file transfer.
  4. Once your connections to OSC services and the OSC Connect app are closed, the temporary folder named "ConnectFiles" will be removed automatically.

OSC Connect Graphical User Interface

After you double-click the OSCConnect.exe icon, the application graphical user interface is shown as below:

 interface

  • Network Status: it indicates which OSC cluster you will be connected to. The option can be changed in "Settings". 

  • Settings: it provides several configuration options to modify the behavior of the application. 

    • Connection Settings: use this dropdown to select the default host/cluster. Selecting a server here will change the endpoint for tunneling, sftp connections, console connections, and connectivity checking.

    • System Settings

      • Detect Clipboard Activity: when this option is enabled, the application will detect valid data on the Windows clipboard and populate the application. ( Default: Off )

      • Check for New Versions: when this option is enabled, the application will check for version updates. (Default: on)

    • Automation Settings:

      • Save User Credentials: when this option is enabled, it allows the application to remember the user when the application is reopened. This saves the user credentials to the user settings using DPAPI Encryption. Passwords can be decrypted only by the current Windows user account. ( Default: Off )

      • Launch Tunnel On Import: when this option is enabled, the tunnel will automatically connect when the application detects a valid clipboard string and the user credentials have been entered. ( Default: On )

    • VNC Settings

  • About: it provides additional information about OSC Connect.

After you provide your OSC Credentials, i.e. your OSC HPC username and password, more functionalities are available as shown below:

gui2

  • SFTP: it uses WinSCP as the SFTP client to securely connect to the OSC file system over SFTP.
  • Console: it uses PuTTY to provide you with shell access to an OSC cluster.

In addition, Session Type is provided so that advanced users can connect to a running session manually.

For most users, you can ignore Session Type and connect to a running session through OSC OnDemand using the configuration generated after your session starts. See "Connecting to a Session" for more details.

Connecting to a Session

The OSC Connect application can be used to connect to a running session launched through OSC OnDemand.

OSC Connect does not have the ability to launch a new session.

  1. Navigate to OSC OnDemand to create a new session under either "Desktops" or "Desktop Apps".
    • More information on how to use OSC OnDemand can be found here
  2. Once your session is running, you have two options to connect to the session: "noVNC Instructions" and "Native Client Instructions". Click "Native Client Instructions" to use OSC Connect. 
  3. Follow the steps below to connect to the running session:
    • Step 1: Download the latest OSC Connect if you don't already have it. See "Getting Started" for more details. 
    • Step 2: Launch OSC Connect. 
    • Step 3: Click - osc://xxxxx . osc://xxxxx is a custom URI scheme that is registered when you launch the application. Simply click the link to populate the configuration information and connect to your running session. If OSCConnect.exe is not running when you click the URI, the OSC Connect application will be popped up. Enter your OSC HPC username and password and you will be able to connect to the session by clicking the "Connect" button.
    • For advanced users, you can skip step 3 and connect to a running session manually. Depending on the type of application, you can choose either "iHPC VNC Desktop" to create a VNC tunnel or "Web Browser Session" to create a browser tunnel.

Be sure to launch OSCConnect.exe at least once before you use it to connect to a running session. The initial launch will add a key to your user registry that initializes the URI scheme.

FAQ

I've clicked the osc:// link and nothing happened.

Be sure to run OSCConnect.exe at least once. The initial launch will add a key to your user registry that initializes the URI scheme. If you move or rename the OSCConnect.exe file, you will need to run the application again manually to update the path in the handler.

I've received the error "Unable to open helper application. The protocol specified in this address is not valid."

This issue appears in some earlier versions of Internet Explorer when attempting to launch the application from a Temporary location. Download and run the OSCConnect.exe application, being sure to save the file to a non-temporary location.

 

 


HOWTO: Debugging Tips

This article focuses on debugging strategies for C/C++ codes, but many are applicable to other languages as well.

Rubber Duck Debugging

This approach is a great starting point. Say you have written some code, and it does not do what you expect it to do. You have stared at it for a few minutes, but you cannot seem to spot the problem.

Try explaining what the problem is to a rubber duck. Then, walk the rubber duck through your code, line by line, telling it what it does. Don’t have a rubber duck? Any inanimate object will do (or even an animate one if you can grab a friend).

It sounds silly, but rubber duck debugging helps you to get out of your head, and hopefully look at your code from a new perspective. Saying what your code does (or is supposed to do) out loud has a good chance of revealing where your understanding might not be as good as you think it is.

Printf() Debugging

You’ve written a whole bunch of new code. It takes some inputs, chugs along for a while, and then creates some outputs. Somewhere along this process, something goes wrong. You know this because the output is not at all what you expected. Unfortunately, you have no idea where things are going wrong in the code.

This might be a good time to try out printf() debugging. It’s as simple as its name implies: simply add (more) printf() statements to your code. You’ve likely seen this being used. It’s the name given to the infamous ‘printf(“here”);’ calls used to verify that a particular codepath is indeed taken.

Consider printing out arguments and return values to key functions. Or, the results or summary statistics from large calculations. These values can be used as “sanity checks” to ensure that up until that point in the code, everything is going as expected.

Assertion calls, such as "assert(...)", can also be used for a similar purpose. However, the positive feedback you get from print statements is often helpful when you’re debugging. Seeing a valid result printed to standard output or a log file tells you positively that at least something is working correctly.

Debuggers

Debuggers are tools that can be used to interactively (or with scripts) debug your code. A fairly common debugger for C and C++ codes is gdb. Many guides exist online for using gdb with your code.

OSC systems also provide the ARM DDT debugger. This debugger is designed for use with HPC codes and is arguably easier to use than gdb. It can be used to debug MPI programs as well.

Debuggers allow you to interact with the program while it is running. You can do things like read and write variable values, or check to see if/when certain functions are called.

Testing

Okay, this one isn’t exactly a debugging strategy. It’s a method to catch bugs early, and even prevent the addition of bugs. Writing a test suite for your code that’s easy to run (and ideally fast) lets you test new changes to ensure they don’t break existing functionality.

There are lots of different philosophies on testing software, too many to cover here. Here are two concepts that are worth looking into: unit testing and system testing.

The idea behind unit testing is writing tests for small “units” of code. These are often functions or classes. If you know that the small pieces that make up your code work, then you’ll have more confidence in the overall assembled program. There’s an added architecture benefit here too. Writing code that is testable in the first place often results in code that’s broken up into separate logical pieces (google “separation of concerns”). This makes your code more modular and less “spaghetti-like”. Your code will be easier to modify and understand.

The second concept – system testing – involves writing tests that run your entire program. These often take longer than unit tests, but have the added benefit that they’ll let you know whether or not your entire program still works after introducing a new change.

When writing tests (both system and unit tests), it’s often helpful to include a couple different inputs. Occasionally a program may work just fine for one input, but fail horribly with another input.
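As a rough sketch of a system test that exercises more than one input, the shell function below runs a command, feeds it an input on stdin, and compares what it prints with an expected value. The helper name run_case is hypothetical, and tr is only a stand-in for the program under test; substitute your own executable and inputs.

```shell
# Minimal system-test sketch. run_case is a hypothetical helper, not an OSC tool.
# Usage: run_case <description> <input> <expected output> <command> [args...]
failures=0

run_case() {
    local description="$1"
    local input="$2"
    local expected="$3"
    local actual
    shift 3
    # Feed the input on stdin and capture whatever the program prints.
    actual=$(printf '%s' "$input" | "$@")
    if [ "$actual" = "$expected" ]; then
        echo "PASS: $description"
    else
        echo "FAIL: $description (expected '$expected', got '$actual')"
        failures=$((failures + 1))
    fi
}

# Stand-in "program under test": tr upper-cases its input.
run_case "plain word"  "hello" "HELLO" tr 'a-z' 'A-Z'
run_case "empty input" ""      ""      tr 'a-z' 'A-Z'
run_case "mixed case"  "OsC 1" "OSC 1" tr 'a-z' 'A-Z'

echo "$failures test(s) failed"
```

Keeping each case on one line makes it cheap to add another input whenever a new bug turns up.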

Minimal, Reproducible Example

Maybe your code takes a couple hours (or longer…) to run. There’s a bug in it, but every time you try to fix it, you have to wait a few hours to see if the fix worked. This is driving you crazy.

A possible approach to make your life easier is to try to make a Minimal, Reproducible Example (see this stackoverflow page for information).

Try to extract just the code that fails, from your program, and also its inputs. Wrap this up into a separate program. This allows you to run just the code that failed, hopefully greatly reducing the time it takes to test out fixes to the problem.

Once you have this example, can you make it smaller? Maybe take out some code that’s not needed to reproduce the bug, or shrink the input even further? Doing this might help you solve the problem.

Tools and other resources

  • Compiler warnings – compilers are your friend. Chances are your compiler has a flag that can be used to enable more warnings than are on by default. GNU tools have “-Wall” and “-Wextra”. These can be used to instruct the compiler to tell you about places in the code where bugs may exist.
  • The Practice of Programming by Brian Kernighan and Rob Pike contains a very good chapter on debugging C and C++ programs.
  • Valgrind is a tool that can be used for many types of debugging, including looking for memory corruption and leaks. However, it slows down your code considerably, which might make it infeasible for HPC codes.
  • ASAN (address sanitizer) is another tool that can be used for memory debugging. It is less featureful than Valgrind, but runs much quicker, and so will likely work with your HPC code.

 


HOWTO: Establish durable SSH connections

In December 2021 OSC updated its firewall to enhance security. As a result, SSH sessions are being closed more quickly than they used to be. It is very easy to modify your SSH options in the client you use to connect to OSC to keep your connection open.

In ~/.ssh/config (use the command touch ~/.ssh/config to create it if there is no existing one), you can set 3 options:

TCPKeepAlive=no
ServerAliveInterval=60
ServerAliveCountMax=5

Please refer to your SSH client documentation for how to set these options in your client.
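For OpenSSH clients, one possible way to apply these options only to OSC hosts is to place them under a Host block. The sketch below is a hypothetical helper, not an OSC-provided script: the function name add_keepalive_options, the marker comment, and the Host pattern *.osc.edu are all assumptions to adapt to your own setup.

```shell
# Hypothetical helper: append the three keep-alive options to an OpenSSH
# client config file. The Host pattern "*.osc.edu" is an assumption; adjust it
# to the host names you actually connect to.
add_keepalive_options() {
    local config_file="$1"
    mkdir -p "$(dirname "$config_file")"
    touch "$config_file"
    # Skip the append if an earlier run already added the block.
    if grep -q '^# OSC keep-alive settings$' "$config_file"; then
        return 0
    fi
    cat >> "$config_file" << 'EOF'
# OSC keep-alive settings
Host *.osc.edu
    TCPKeepAlive no
    ServerAliveInterval 60
    ServerAliveCountMax 5
EOF
    # OpenSSH refuses config files that are writable by other users.
    chmod 600 "$config_file"
}

# Example call (uncomment to modify your real config):
# add_keepalive_options "$HOME/.ssh/config"
```

The marker comment lets the function be run more than once without duplicating the block.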


HOWTO: Identify users on a project account and check status

An eligible principal investigator (PI) heads a project account and can authorize/remove user accounts under the project account (please check our Allocations and Accounts documentation for more details). This document shows you how to identify users on a project account and check the status of each user. 

Identify Users on a Project Account

If you know the project account

If the project account (projectID) is known, the OSCgetent command will list all users on the project:

$ OSCgetent group projectID

The returned information is in the format of:

projectID:*:gid: list of user IDs

gid is the group identifier number unique for the project account projectID. 

For example, the command OSCgetent group PZS0712 lists all users on the project account PZS0712 as below:

$ OSCgetent group PZS0712
PZS0712:*:5513:amarcum,guilfoos,hhamblin,kcahill,xwang

Multiple groups can also be queried at once.

For example, the command OSCgetent group PZS0712 PZS0726 lists all users on both PZS0712 and PZS0726:

$ OSCgetent group PZS0712 PZS0726
PZS0712:*:5513:amarcum,guilfoos,hhamblin,kcahill,xwang
PZS0726:*:6129:amarcum,kkappel
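If you want one line per user instead of a comma-separated list, the group lines can be post-processed with standard shell tools. This is a minimal sketch, assuming the projectID:*:gid:users format shown above; the function name list_project_members is hypothetical, and the sample output stands in for the real command.

```shell
# Hypothetical helper: turn "projectID:*:gid:user1,user2" lines (read from
# stdin) into one "projectID user" pair per line.
list_project_members() {
    local project placeholder gid members user
    while IFS=: read -r project placeholder gid members; do
        # Put each comma-separated member on its own line, then label it.
        echo "$members" | tr ',' '\n' | while read -r user; do
            if [ -n "$user" ]; then
                echo "$project $user"
            fi
        done
    done
}

# On an OSC login node this could be fed directly from the command:
#   OSCgetent group PZS0712 PZS0726 | list_project_members
# Here the sample output from above stands in for the real command.
list_project_members << 'EOF'
PZS0712:*:5513:amarcum,guilfoos,hhamblin,kcahill,xwang
PZS0726:*:6129:amarcum,kkappel
EOF
```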

Details on a project can also be obtained along with the user list using the OSCfinger command.

$ OSCfinger -g projectID 

This returns:

Group: projectID                                  GID: XXXX
Status: 'active/restricted/etc'                   Type: XX
Principal Investigator: 'PI email'                Admins: NA
Members: 'list of users'
Category: NA
Institution: 'affiliated institution'
Description: 'short description'
---

If you don't know the project account, but know the username

If the project account is not known, but the username is known, use the  OSCfinger command to list all of the groups the user belongs to:

OSCfinger username

The returned information is in the format of:

Login: username                                   Name: First Last
Directory: home directory path                    Shell: /bin/bash
E-mail: user's email address
Primary Group: user's primary project
Groups: list of projects and other groups user is in
Password Changed: date password was last changed  Password Expires: date password expires
Login Disabled: TRUE/FALSE                             Password Expired: TRUE/FALSE
Current Logins:
Displays if user is currently logged in and from where/when

For example, with the username as amarcum, the command OSCfinger amarcum returns the information as below:

$ OSCfinger amarcum
Login: amarcum                                    Name: Antonio Marcum
Directory: /users/PZS0712/amarcum                 Shell: /bin/bash
E-mail: amarcum@osc.edu
Primary Group: PZS0712
Groups: sts,ruby,l2supprt,oscall,clntstf,oscstaff,clntall,PZS0712,PZS0726
Password Changed: May 12 2019 15:47 (calculated)  Password Expires: Aug 11 2019 12:05 AM
Login Disabled: FALSE                             Password Expired: FALSE
Current Logins:
On since Mar 07 2019 12:12 on pts/14 from pitzer-login01.hpc.osc.edu
----

If you don't know either the project account or user account

If neither the project account nor the username is known, use the OSCfinger command with the '-e' flag to look up the user account by the user's name.

Use the following command to list all of the user accounts associated with a First and Last name:

$ OSCfinger -e 'First Last'

For example, with the user's first name as Summer and last name as Wang, the command OSCfinger -e 'Summer Wang' returns the information below:

$ OSCfinger -e 'Summer Wang'
Login: xwang                                      Name: Summer Wang
Directory: /users/oscgen/xwang                    Shell: /bin/bash
E-mail: xwang@osc.edu
Primary Group: PZS0712
Groups: amber,abaqus,GaussC,comsol,foampro,sts,awsmdev,awesim,ruby,matlab,aasheats,mars,ansysflu,wrigley,lgfuel,l2supprt,fsl,oscall,clntstf,oscstaff,singadm,clntall,dhgremot,fsurfer,PZS0530,PCON0003,PZS0680,PMIU0149,PZS0712,PAS1448
Password Changed: Jan 08 2019 11:41               Password Expires: Jul 08 2019 12:05 AM
Login Disabled: FALSE                             Password Expired: FALSE
---

Once you know the username, follow the steps in the earlier section, Identify Users on a Project Account, to list all user accounts on the project. Please contact OSC Help if you have any questions.

Check the Status of a User

Use the OSCfinger command to check the status of a user account as below:

OSCfinger username

For example, if the username is xwang, the command  OSCfinger xwang will return:

$ OSCfinger xwang
Login: xwang                                      Name: Summer Wang
Directory: /users/oscgen/xwang                    Shell: /bin/bash
E-mail: xwang@osc.edu
Primary Group: PZS0712
Groups: amber,abaqus,GaussC,comsol,foampro,sts,awsmdev,awesim,ruby,matlab,aasheats,mars,ansysflu,wrigley,lgfuel,l2supprt,fsl,oscall,clntstf,oscstaff,singadm,clntall,dhgremot,fsurfer,PZS0530,PCON0003,PZS0680,PMIU0149,PZS0712,PAS1448
Password Changed: Jan 08 2019 11:41               Password Expires: Jul 08 2019 12:05 AM
Login Disabled: FALSE                             Password Expired: FALSE
---
  • The home directory of xwang is /users/oscgen/xwang (Directory: /users/oscgen/xwang).
  • The shell of xwang is bash (Shell: /bin/bash). If the output shows Shell: /access/denied, the user account has been either archived or restricted. Please contact OSC Help if you'd like to reactivate this user account.
  • xwang@osc.edu is the email address associated with the user account xwang; that is, all OSC emails related to the account xwang will be sent to xwang@osc.edu (E-mail: xwang@osc.edu). If the email address for this user account has changed, please contact OSC Help so that important notifications, messages, and reminders from OSC are received in a timely manner.

Check the Usage and Quota of a User's Home Directory/Project Space

All users see their file system usage statistics when logging in, like so:

As of 2018-01-25T04:02:23.749853 userid userID on /users/projectID used XGB of quota 500GB and Y files of quota 1000000 files

The information is from the file /users/reporting/storage/quota/*_quota.txt , which is updated twice a day. Some users may see multiple lines associated with a username, as well as information on project space usage and quota of their Primary project, if there is one. The usage and quota of the home directory of a username is provided by the line including the file server your home directory is on (for more information, please visit Home Directories), while others (generated due to file copy) can be safely ignored. 

You can check any user's home directory or a project's project space usage and quota by running:

grep -h <userID or projectID> /users/reporting/storage/quota/*_quota.txt

Here is an example of project PZS0712:

$ grep -h PZS0712 /users/reporting/storage/quota/*_quota.txt
As of 2019-03-07T13:55:01.000000 project/group PZS0712 on /fs/project used 262 GiB of quota 2048 GiB and 166987 files of quota 200000 files
As of 2019-03-07T13:55:01.000000 userid xwang on /fs/project/PZS0712 used 0 GiB of quota 0 GiB and 21 files of quota 0 files
As of 2019-03-07T13:55:01.000000 userid dheisterberg on /fs/project/PZS0712 used 262 GiB of quota 0 GiB and 166961 files of quota 0 files
As of 2019-03-07T13:55:01.000000 userid amarcum on /fs/project/PZS0712 used 0 GiB of quota 0 GiB and 2 files of quota 0 files
As of 2019-03-07T13:55:01.000000 userid root on /fs/project/PZS0712 used 0 GiB of quota 0 GiB and 2 files of quota 0 files
As of 2019-03-07T13:55:01.000000 userid guilfoos on /fs/project/PZS0712 used 0 GiB of quota 0 GiB and 1 files of quota 0 files
As of 2019-03-07T13:51:23.000000 userid amarcum on /users/PZS0712 used 399.86 MiB of quota 500 GiB and 8710 files of quota 1000000 files

Here is an example for username amarcum:

$ grep -h amarcum /users/reporting/storage/quota/*_quota.txt
As of 2019-03-07T13:55:01.000000 userid amarcum on /fs/project/PZS0712 used 0 GiB of quota 0 GiB and 2 files of quota 0 files
As of 2019-03-07T13:56:39.000000 userid amarcum on /users/PZS0645 used 4.00 KiB of quota 500 GiB and 1 files of quota 1000000 files
As of 2019-03-07T13:56:39.000000 userid amarcum on /users/PZS0712 used 399.86 MiB of quota 500 GiB and 8710 files of quota 1000000 files
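When a report contains many lines, it can help to reduce each one to the figure you care about. The sketch below is a hypothetical helper (the name file_quota_percent is not an OSC command) that prints the owner, the file system, and the percentage of the file-count quota in use. It assumes the whitespace-separated layout of the grep examples above and skips lines whose file quota is 0.

```shell
# Hypothetical helper: read quota report lines on stdin and print the owner,
# the file system, and the share of the file-count quota in use.
# Field positions assume the layout shown above:
#   As of <time> <kind> <owner> on <path> used <n> <unit> of quota <n> <unit>
#   and <files> files of quota <max files> files
file_quota_percent() {
    # $5 = owner, $7 = path, $16 = files used, $20 = file quota (0 = no quota set)
    awk '$20 > 0 { printf "%s %s %.1f%%\n", $5, $7, 100 * $16 / $20 }'
}

# On an OSC system this could be fed from the report files:
#   grep -h PZS0712 /users/reporting/storage/quota/*_quota.txt | file_quota_percent
# Here three of the sample lines from above stand in for the real files.
file_quota_percent << 'EOF'
As of 2019-03-07T13:55:01.000000 project/group PZS0712 on /fs/project used 262 GiB of quota 2048 GiB and 166987 files of quota 200000 files
As of 2019-03-07T13:55:01.000000 userid xwang on /fs/project/PZS0712 used 0 GiB of quota 0 GiB and 21 files of quota 0 files
As of 2019-03-07T13:51:23.000000 userid amarcum on /users/PZS0712 used 399.86 MiB of quota 500 GiB and 8710 files of quota 1000000 files
EOF
```

With the sample input, the project line reports 83.5% of its file quota in use, the kind of figure that is easy to miss in the raw report.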

Check the RU Usage for Projects and Users

The OSCusage command can provide detailed information about computational usage for a given project and user.

See the OSCusage command page for details.


HOWTO: Install a MATLAB toolbox

If you need to use a MATLAB toolbox that is not provided through our installations, you can follow these instructions. If you have any difficulties, contact OSC Help for assistance.

A reminder: It is your responsibility to verify that your use of software packages on OSC’s systems including any 3rd party toolboxes (whether installed by OSC staff or by yourself)  complies with the packages’ license terms.

Gather your materials

First, we recommend making a new directory within your home directory in order to keep everything organized. You can use the unix command to make a new directory: "mkdir"

Now you can download the toolbox either to your desktop, and then upload it to OSC, or directly download it using the "wget" utility (if you know the URL for the file).

Now you can extract the downloaded file.

Adding the path

There are two ways to add the MATLAB toolbox path.

Method 1: Open the MATLAB GUI and click on "Set Path" and "Add folder".

Method 2: Use the "addpath" function in your script. More information on the function can be found here: https://www.mathworks.com/help/matlab/ref/addpath.html

Running the toolbox

Please refer to the instructions given alongside the toolbox. They should contain instructions on how to run the toolbox.


HOWTO: Install your own Perl modules

While we provide a number of Perl modules, you may need a module we do not provide. If it is a commonly used module, or one that is particularly difficult to compile, you can contact OSC Help for assistance, but we have provided an example below showing how to build and install your own Perl modules. Note, these instructions use "bash" shell syntax; this is our default shell, but if you are using something else (csh, tcsh, etc), some of the syntax may be different.

CPAN Minus

CPAN, the Comprehensive Perl Archive Network, is the primary source for publishing and fetching the latest modules and libraries for the Perl programming language. The default method for installing Perl modules, the "CPAN Shell", provides users with a great deal of power and flexibility, but at the cost of a complex configuration and an inelegant default setup.

 

Setting Up CPAN Minus

To use CPAN Minus, we must first load it, if it hasn't already been loaded. Note that this is not necessary if you loaded a version of Perl with the module load command.

module load cpanminus

Next, in order to use cpanminus, you will need to run the following command only ONCE:

perl -I $CPANMINUS_INC -Mlocal::lib

 

Using CPAN Minus

In most cases, using CPAN Minus to install modules is as simple as issuing a command in the following form:

cpanm [Module::Name]

For example, the following commands install three Perl modules:

cpanm Math::CDF
cpanm Set::IntervalTree
cpanm DB_File

 

Testing Perl Modules

To test that a Perl module can be loaded, run commands such as the following:

perl -e "require Math::CDF"
perl -e "require Set::IntervalTree"
perl -e "require DB_File"

The modules are installed correctly if no output is printed.
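Because a successful check prints nothing, a small wrapper that reports the result explicitly can be easier to read when checking several modules. This is a minimal sketch; the function name check_perl_module is hypothetical, and it relies only on the exit status of the perl -e "require ..." command shown above.

```shell
# Hypothetical helper: report whether Perl can load a module.
check_perl_module() {
    local module="$1"
    # perl exits with a non-zero status when the require fails.
    if perl -e "require $module" 2> /dev/null; then
        echo "ok: $module"
    else
        echo "missing: $module"
        return 1
    fi
}

# Check the three example modules; "|| true" keeps the loop going after a
# failure.
for module in Math::CDF Set::IntervalTree DB_File; do
    check_perl_module "$module" || true
done
```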

What Local Modules are Installed in my Account?

To show the local modules you have installed in your user account:

perldoc perllocal

Resetting Module Collection

If you should ever want to start over with your perl module collection, delete the following folders:

rm -r ~/perl5 
rm -r ~/.cpanm

 


HOWTO: Locally Installing Software

Sometimes the best way to get access to a piece of software on the HPC systems is to install it yourself as a "local install". This document will walk you through the OSC-recommended procedure for maintaining local installs in your home directory or project space. The majority of this document describes the process of "manually" building and installing your software. We also show a partially automated approach through the use of a bash script in the Install Script section near the end.

NOTE: Throughout this document we'll assume you're installing into your home directory, but you can follow the steps below in any directory for which you have read/write permissions.
This document assumes you are familiar with the process of building software using "configure" or via editing makefiles, and only provides best practices for installing in your home directory.

Getting Started

Before installing your software, you should first prepare a place for it to live. We recommend the following directory structure, which you should create in the top-level of your home directory:

    local
    |-- src
    |-- share
        `-- lmodfiles

This structure is analogous to how OSC organizes the software we provide. Each directory serves a specific purpose:

  • local - Gathers all the files related to your local installs into one directory, rather than cluttering your home directory. Applications will be installed into this directory with the format "appname/version". This allows you to easily store multiple versions of a particular software install if necessary.
  • local/src - Stores the installers -- generally source directories -- for your software. Also, stores the compressed archives ("tarballs") of your installers; useful if you want to reinstall later using different build options.
  • local/share/lmodfiles - The standard place to store module files, which will allow you to dynamically add or remove locally installed applications from your environment.

You can create this structure with one command:

    mkdir -p $HOME/local/src $HOME/local/share/lmodfiles

(NOTE:  $HOME is defined by the shell as the full path of your home directory. You can view it from the command line with the command echo $HOME.)

Installing Software

Now that you have your directory structure created, you can install your software. For demonstration purposes, we will install a local copy of Git.

First, we need to get the source code onto the HPC filesystem. The easiest thing to do is find a download link, copy it, and use the wget tool to download it on the HPC. We'll download this into $HOME/local/src:

    cd $HOME/local/src
    wget https://github.com/git/git/archive/v2.9.0.tar.gz

Now extract the tar file:

    tar zxvf v2.9.0.tar.gz

Next, we'll go into the source directory and build the program. Consult your application's documentation to determine how to install into $HOME/local/"software_name"/"version". Replace "software_name" with the software's name and "version" with the version you are installing, as demonstrated below. In this case, we'll use the configure tool's --prefix option to specify the install location.

You'll also want to specify a few variables to help make your application more compatible with our systems. We recommend specifying that you wish to use the Intel compilers and that you want to link the Intel libraries statically. This will prevent you from having to have the Intel module loaded in order to use your program. To accomplish this, add CC=icc CFLAGS=-static-intel to the end of your invocation of configure. If your application does not use configure, you can generally still set these variables somewhere in its Makefile or build script.

Then, we can build Git using the following commands:

    cd git-2.9.0
    autoconf # this creates the configure file
    ./configure --prefix=$HOME/local/git/2.9.0 CC=icc CFLAGS=-static-intel
    make && make install

Your application should now be fully installed. However, before you can use it you will need to add the installation's directories to your path. To do this, you will need to create a module.

Creating a Module

Modules allow you to dynamically alter your environment to define environment variables and bring executables, libraries, and other features into your shell's search paths.

Automatically create a module

We can use the mkmod script to create a simple Lua module for the Git installation:

module load mkmod
create_module.sh git 2.9.0 $HOME/local/git/2.9.0

It will create the module $HOME/local/share/lmodfiles/git/2.9.0.lua. Please note that by default our mkmod script only creates module files that define some basic environment variables: PATH, LD_LIBRARY_PATH, MANPATH, and GIT_HOME. These defaults may not cover all the paths you need. We can override them as follows:

module load mkmod
TOPDIR_LDPATH_LIST="lib:lib64" \
TOPDIR_PATH_LIST="bin:exe" \
create_module.sh git 2.9.0 $HOME/local/git/2.9.0

This adds $GIT_HOME/bin and $GIT_HOME/exe to PATH, and $GIT_HOME/lib and $GIT_HOME/lib64 to LD_LIBRARY_PATH.

We can also add other variables by using ENV1, ENV2, and more. For example, suppose we want to change the default editor to vim for Git:

module load mkmod
ENV1="GIT_EDITOR=vim" \
create_module.sh git 2.9.0 $HOME/local/git/2.9.0

Manually create a module 

We will be using the filename 2.9.0.lua ("version".lua). A simple Lua module for our Git installation would be:

-- Local Variables
local name = "git"
local version = "2.9.0"

-- Locate Home Directory
local homedir = os.getenv("HOME")
local root = pathJoin(homedir, "local", name, version)

-- Set Basic Paths
prepend_path("PATH", pathJoin(root, "bin"))
prepend_path("LD_LIBRARY_PATH", root .. "/lib")
prepend_path("LIBRARY_PATH", root .. "/lib")
prepend_path("INCLUDE", root .. "/include")
prepend_path("CPATH", root .. "/include")
prepend_path("PKG_CONFIG_PATH", root .. "/lib/pkgconfig")
prepend_path("MANPATH", root .. "/share/man")

NOTE: For future module files, copy our sample modulefile from ~support/doc/modules/sample_module.lua. This module file follows the recommended design patterns laid out above and includes samples of many common module operations.

Our clusters use a Lua based module system. However, there is another module system based in TCL that will not be discussed in this HOWTO.
NOTE: TCL is cross-compatible and is converted to Lua when loaded. More documentation is available at https://www.tacc.utexas.edu/research-development/tacc-projects/lmod/ or by executing module help.

Initializing Modules

Any module file you create should be saved into your local lmodfiles directory ($HOME/local/share/lmodfiles). To prepare for future software installations, create a subdirectory within lmodfiles named after your software and add one module file to that directory for each version of the software installed.

In the case of our Git example, you should create the directory $HOME/local/share/lmodfiles/git and create a module file within that directory named 2.9.0.lua.

To make this module usable, you need to tell lmod where to look for it. In our example, you can do this by issuing the command module use $HOME/local/share/lmodfiles. You can confirm the change by running module avail. You can then load your software using either module load git or module load git/2.9.0.

NOTE: module use $HOME/local/share/lmodfiles and module load "software_name" need to be entered into the command line every time you enter a new session on the system.
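One common way to avoid retyping the module use command is to add it to a shell startup file. The sketch below is an illustration under assumptions, not an OSC-documented procedure: it assumes your login shell is bash and reads ~/.bashrc, and the function name add_module_use_line is hypothetical.

```shell
# Hypothetical helper: add a "module use" line to a shell startup file unless
# the exact line is already present.
add_module_use_line() {
    local startup_file="$1"
    # Single quotes are deliberate: $HOME should be expanded at login time,
    # not when this function runs.
    local line='module use $HOME/local/share/lmodfiles'
    touch "$startup_file"
    # -F matches the text literally; -x requires the whole line to match.
    if ! grep -qxF "$line" "$startup_file"; then
        echo "$line" >> "$startup_file"
    fi
}

# Example call (uncomment to modify your real startup file):
# add_module_use_line "$HOME/.bashrc"
```

Loading the software itself (module load) is left as a manual step here, so that new sessions do not silently pick up a locally built version.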

If you install another version later on (let's say version 2.9.1) and want to create a module file for it, you need to make sure you call it 2.9.1.lua. When loading Git, lmod will automatically load the newer version. If you need to go back to an older version, you can do so by specifying the version you want: module load git/2.9.0.

To make sure you have the correct module file loaded, type which git, which should emit "~/local/git/2.9.0/bin/git" (NOTE: ~ is equivalent to $HOME).

To make sure the software was installed correctly and that the module is working, type git --version which should emit "git version 2.9.0".

Automating With Install Script

Simplified versions of the scripts used to manage the central OSC software installations are provided at ~support/share/install-script. The idea is that you provide the minimal commands needed to obtain, compile, and install the software (usually some variation on wget, tar, ./configure, make, and make install) in a script, which then sources an OSC-maintained template that provides all of the "boilerplate" commands to create and manage a directory structure similar to that outlined in the Getting Started section above. You can copy an example install script from ~support/share/install-script/install-osc_sample.sh and follow the notes in that script, as well as in ~support/share/install-script/README.md, to modify it to install software of your choosing.

NOTE: By default, the install script puts the module files in $HOME/osc_apps/lmodfiles, so you will need to run module use $HOME/osc_apps/lmodfiles and module load [software-name] every time you enter a new session on the system and want to use the software that you have installed.

Further Reading

For more information about modules, be sure to read the webpage indicated at the end of module help. If you have any questions about modules or local installations, feel free to contact the OSC Help Desk at oschelp@osc.edu.


HOWTO: Manage Access Control Lists (ACLs)

An ACL (access control list) is a list of permissions associated with a file or directory. These permissions allow you to restrict access to a certain file or directory by user or group. 

OSC supports NFSv4 ACL on our home directory and POSIX ACL on our project and scratch file systems. Please see the how to use NFSv4 ACL for home directory ACL management and how to use POSIX ACL for managing ACLs in project and scratch file systems. 


HOWTO: Use NFSv4 ACL

This document shows you how to use the NFSv4 ACL permissions system. An ACL (access control list) is a list of permissions associated with a file or directory. These permissions allow you to restrict access to a certain file or directory by user or group. NFSv4 ACLs provide more specific options than typical POSIX read/write/execute permissions used in most systems.

These commands are useful for managing ACLs on home directories under /users/<project-code>.

Understanding NFSv4 ACL

This is an example of an NFSv4 ACL:

A::user@nfsdomain.org:rxtncy
A::alice@nfsdomain.org:rxtncy
A::alice@nfsdomain.org:rxtncy
A::alice@nfsdomain.org:rxtncy

The following sections break down this example from left to right and describe further usage options.

 

ACE Type

The 'A' in the example is known as the ACE (access control entry) type. The 'A' denotes "Allow" meaning this ACL is allowing the user or group to perform actions requiring permissions. Anything that is not explicitly allowed is denied by default.

Note: 'D' can denote a Deny ACE. While this is a valid option, this ACE type is not recommended: any permission that is not explicitly granted is automatically denied, so Deny ACEs can be redundant and complicated.

 

ACE Flags

The above example could have a distinction known as a flag shown below

A:d:user@osc.edu:rxtncy

The 'd' used above is called an inheritance flag. It causes the ACL set on this directory to be applied automatically to any new subdirectories. Inheritance flags only work on directories, not files. Multiple inheritance flags can be used in combination or omitted entirely. Examples of inheritance flags are listed below:

Flag  Name                  Function
d     directory-inherit     New subdirectories will have the same ACE
f     file-inherit          New files will have the same ACE minus the inheritance flags
n     no-propagate-inherit  New subdirectories will inherit the ACE minus the inheritance flags
i     inherit-only          New files and subdirectories will have this ACE but the ACE for the directory with the flag is null

 

ACE Principal

The 'user@nfsdomain.org' is a principal. The principal denotes the people the ACL is allowing access to. Principals can be the following:

  • A named user
  • Special principals
    • OWNER@
    • GROUP@
    • EVERYONE@
  • A group
    • Note: When the principal is a group, you need to add a group flag, 'g', as shown in the below example
    • A:g:group@osc.edu:rxtncy
      

 

ACE Permissions

The 'rxtncy' are the permissions the ACE is allowing. Permissions can be used in combination with each other. A list of permissions and what they do can be found below:

Permission  Function
r           read-data (files) / list-directory (directories)
w           write-data (files) / create-file (directories)
a           append-data (files) / create-subdirectory (directories)
x           execute (files) / change-directory (directories)
d           delete the file/directory
D           delete-child: remove a file or subdirectory from the given directory (directories only)
t           read the attributes of the file/directory
T           write the attributes of the file/directory
n           read the named attributes of the file/directory
N           write the named attributes of the file/directory
c           read the file/directory ACL
C           write the file/directory ACL
o           change ownership of the file/directory

 

Note: Aliases such as 'R', 'W', and 'X' can be used as permissions. These work similarly to POSIX Read/Write/Execute. More detail can be found below.

Alias  Name     Expansion
R      Read     rntcy
W      Write    watTNcCy (with D added to directory ACEs)
X      Execute  xtcy
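To make the four colon-separated parts of an ACE concrete, the sketch below splits an entry into its type, flags, principal, and permissions. The function name describe_ace is hypothetical; it only reformats the string and does not query or change any ACL.

```shell
# Hypothetical helper: split one ACE of the form
# type:flags:principal:permissions into labeled fields.
describe_ace() {
    local ace_type flags principal permissions
    # An empty flags field (two consecutive colons) is kept as an empty value.
    IFS=: read -r ace_type flags principal permissions << EOF
$1
EOF
    echo "type=$ace_type"
    echo "flags=${flags:-none}"
    echo "principal=$principal"
    echo "permissions=$permissions"
}

describe_ace 'A:g:group@osc.edu:rxtncy'
describe_ace 'A::OWNER@:rwaDxtTnNcCy'
```

For the first entry this prints type=A, flags=g, principal=group@osc.edu, and permissions=rxtncy, matching the breakdown in the sections above.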

 

Using NFSv4 ACL

This section shows you how to set, modify, and view ACLs.

 

Set and Modify ACLs

To set an ACE use this command:

nfs4_setfacl [OPTIONS] COMMAND file

To modify an ACE, use this command:

nfs4_editfacl [OPTIONS] file

Where file is the name of your file or directory. More information on Options and Commands can be found below.

 

Commands

Commands are only used when first setting an ACE. Commands and their uses are listed below.

COMMAND               FUNCTION
-a acl_spec [index]   add ACL entries in acl_spec at index (DEFAULT: 1)
-x acl_spec | index   remove ACL entries or entry-at-index from ACL
-A file [index]       read ACL entries to add from file
-X file               read ACL entries to remove from file
-s acl_spec           set ACL to acl_spec (replaces existing ACL)
-S file               read ACL entries to set from file
-m from_ace to_ace    modify in-place: replace 'from_ace' with 'to_ace'

 

Options

Options can be used in combination or omitted entirely. A list of options is shown below:

OPTION  NAME       FUNCTION
-R      recursive  Applies ACE to a directory's files and subdirectories
-L      logical    Used with -R, follows symbolic links
-P      physical   Used with -R, skips symbolic links
 

 

View ACLs

To view ACLs, use the following command:

nfs4_getfacl file

Where file is your file or directory.

Use cases

Create a share folder for a specific group

First, make the top-level of home dir group executable.

nfs4_setfacl -a A:g:<group>@osc.edu:X $HOME
We make $HOME executable only, so that the group can traverse to the share folder created in the next steps but cannot list the other folders in your home dir. Providing executable access lets one (user/group) go to that dir, but not read its contents.

Next create a new folder to store shared data

mkdir share_group

Move all data to be shared that already exists to this folder

mv <src> ~/share_group

Apply the acl for all current files and dirs under ~/share_group, and set acl so that new files created there will automatically have proper group permissions

nfs4_setfacl -R -a A:dfg:<group>@osc.edu:RX ~/share_group

Using an ACL file

One can also specify the acl to be used in a single file, then apply that acl to avoid duplicate entries and keep the acl entries consistent.

$ cat << EOF > ~/group_acl.txt

A:fdg:clntstf@osc.edu:rxtncy
A::OWNER@:rwaDxtTnNcCy
A:g:GROUP@:tcy
A::EVERYONE@:rxtncy
EOF
$ nfs4_setfacl -R -S ~/group_acl.txt ~/share_group
Remember that any existing data moved into the share folder will retain its original permissions/acl.
That data will need to be set with a new acl manually to allow group read permissions.
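
Each ACE in an ACL file like the one above has four colon-separated fields: type, flags, principal, and permissions. As a rough aid for checking such a file before applying it, here is a minimal Python sketch; the names Ace, parse_ace, and parse_acl_file are hypothetical and are not part of the nfs4 tools.

```python
from typing import NamedTuple


class Ace(NamedTuple):
    ace_type: str      # e.g. "A" (allow)
    flags: str         # e.g. "fdg"; may be empty
    principal: str     # e.g. "clntstf@osc.edu" or "OWNER@"
    permissions: str   # e.g. "rxtncy"


def parse_ace(line):
    """Split one 'type:flags:principal:permissions' line into its fields."""
    parts = line.strip().split(":")
    if len(parts) != 4:
        raise ValueError(
            f"expected 4 colon-separated fields, got {len(parts)}: {line!r}"
        )
    return Ace(*parts)


def parse_acl_file(text):
    """Parse every non-blank line of an ACL file into Ace tuples."""
    return [parse_ace(line) for line in text.splitlines() if line.strip()]
```

For example, parse_ace("A:fdg:clntstf@osc.edu:rxtncy").principal returns "clntstf@osc.edu", which makes it easy to confirm which user or group each entry applies to.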

Share data in your home directory with other users

Assume that you want to share a directory (e.g., data) and its files and subdirectories, but it is not readable by other users:

> ls -ld /users/PAA1234/john/data
drwxr-x--- 3 john PAA1234 4096 Nov 21 11:59 /users/PAA1234/john/data

Like before, allow the user execute permissions to $HOME.

> nfs4_setfacl -a A::userid@osc.edu:X $HOME

Then set an ACL on the directory 'data' to allow a specific user access:

> cd /users/PAA1234/john
> nfs4_setfacl -R -a A:df:userid@osc.edu:RX data

or to allow a specific group access:

> cd /users/PAA1234/john
> nfs4_setfacl -R -a A:dfg:groupname@osc.edu:RX data

You can repeat the above commands to add more users or groups.

Share entire home dir with a group

Sometimes one wishes to share their entire home dir with a particular group. Care should be taken to only share folders with data and not any hidden dirs.

Some folders in a home dir should retain permissions to only allow the user which owns them to read them. An example is the ~/.ssh dir, which should always have read permissions only for the user that owns it.

Use the command below to assign group read permissions to non-hidden dirs only.

for dir in "$HOME"/*; do nfs4_setfacl -R -a A:dfg:<group>@osc.edu:RX "$dir"; done

After sharing an entire home dir with a group, you can still create a single share folder with the previous instructions to share different data with a different group only. So, all non-hidden dirs in your home dir would be readable by group_a, but a new folder named 'group_b_share' can be created and its acl altered to only share its contents with group_b.
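
Before changing ACLs across a whole home dir, it can help to list exactly which commands would run. The Python sketch below only builds the nfs4_setfacl argument lists for non-hidden entries and executes nothing; the function name build_share_commands is hypothetical.

```python
import os


def build_share_commands(home, group):
    """Return one nfs4_setfacl argument list per non-hidden entry in home.

    Nothing is executed; the caller can inspect the commands first.
    """
    commands = []
    for name in sorted(os.listdir(home)):
        if name.startswith("."):
            continue  # skip hidden entries such as .ssh
        path = os.path.join(home, name)
        commands.append(
            ["nfs4_setfacl", "-R", "-a", f"A:dfg:{group}@osc.edu:RX", path]
        )
    return commands
```

Once the list looks right, each entry could be passed to subprocess.run, or simply printed and pasted into a shell.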

Please contact oschelp@osc.edu if there are any questions.


HOWTO: Use POSIX ACL

This document shows you how to use the POSIX ACL permissions system. An ACL (access control list) is a list of permissions associated with a file or directory. These permissions allow you to restrict access to a certain file or directory by user or group.

These commands are useful for project and scratch dirs located in /fs/project, /fs/scratch, /fs/ess.

Understanding POSIX ACL

An example of a basic POSIX ACL would look like this:

# file: foo.txt 
# owner: tellison 
# group: PZSXXXX 
user::rw- 
group::r-- 
other::r--

The first three lines list basic information about the file/directory in question: the file name, the primary owner/creator of the file, and the primary group that has permissions on the file. The following three lines show the file access permissions for the primary user, the primary group, and any other users. POSIX ACLs use the basic rwx permissions, explained in the following table:

Permission Explanation
r Read-Only Permissions
w Write-Only Permissions
x Execute-Only Permissions

 

Using POSIX ACL

This section will show you how to set and view ACLs, using the setfacl and getfacl commands

Viewing ACLs with getfacl

The getfacl command displays a file or directory's ACL. It is used as follows:

$ getfacl [OPTION] file

Where file is the file or directory you are trying to view. Common options include:

Flag Description
-a/--access Display file access control list only
-d/--default Display default access control list only (only primary access), which determines the default permissions of any files/directories created in this directory
-R/--recursive Display ACLs for subdirectories
-p/--absolute-names Don't strip leading '/' in pathnames

Examples:

A simple getfacl call would look like the following:

$ getfacl foo.txt 
# file: foo.txt
# owner: user
# group: PZSXXXX
user::rw-
group::r--
other::r--
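
Output in this form is easy to process in a script. The following is a minimal Python sketch, assuming the output describes a single file and contains only access entries (default: entries are not handled); the function name parse_getfacl is hypothetical.

```python
def parse_getfacl(text):
    """Parse the getfacl output for one file into a dictionary."""
    result = {"entries": {}}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("# "):
            # Header lines look like "# file: foo.txt"
            key, _, value = line[2:].partition(": ")
            result[key] = value.strip()
        else:
            # Entry lines look like "user::rw-" or "user:name:r-x"
            tag, qualifier, perms = line.split(":", 2)
            # Drop any trailing comment that follows the permissions
            result["entries"][(tag, qualifier)] = perms.split()[0]
    return result
```

With the example above, parse_getfacl(output)["entries"][("group", "")] would return "r--".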

A recursive getfacl call through subdirectories will list each subdirectory's ACL separately:

$ getfacl -R foo/
# file: foo/
# owner: user
# group: PZSXXXX
user::rwx
group::r-x
other::r-x

# file: foo//foo.txt
# owner: user
# group: PZSXXXX
user::rwx
group::---
other::---

# file: foo//bar
# owner: user
# group: PZSXXXX
user::rwx
group::---
other::---

# file: foo//bar/foobar.py
# owner: user
# group: PZSXXXX
user::rwx
group::---
other::---

Setting ACLs with setfacl

The setfacl command allows you to set a file or directory's ACL. It is used as follows:

$ setfacl [OPTION] COMMAND file

Where file is the file or directory you are trying to modify. 

Commands and Options

setfacl takes several commands to modify a file or directory's ACL

Command Function
-m/--modify=acl Modify the current ACL(s) of files. Use as follows: setfacl -m u/g:user/group:r/w/x file
-M/--modify-file=file Read ACL entries to modify from a file. Use as follows: setfacl -M file_with_acl_permissions file_to_modify
-x/--remove=acl Remove entries from the ACL(s) of files. Use as follows: setfacl -x u/g:user/group file
-X/--remove-file=file Read ACL entries to remove from a file. Use as follows: setfacl -X file_with_acl_permissions file_to_modify
-b/--remove-all Remove all extended ACL permissions

Common option flags for setfacl are as follows:

Option Function
-R/--recursive Recurse through subdirectories
-d/--default Apply modifications to default ACLs
--test Test ACL modifications (ACLs are not modified)

Examples

You can set a specific user's access privileges using the following:

setfacl -m u:username:-wx foo.txt

Similarly, a group's access privileges can be set using the following:

setfacl -m g:PZSXXXX:rw- foo.txt

You can remove a specific user's access using the following

setfacl -x user:username foo.txt

Grant a user recursive read access to a dir and all files/dirs under it (notice that the capital 'X' is used to provide execute permissions only to dirs and not files):

setfacl -R -m u:username:r-X shared-dir

Set a dir so that any newly created files or dirs under it will inherit the parent dir's ACL:

setfacl -d -m u:username:r-X shared-dir

HOWTO: Reduce Disk Space Usage

This HOWTO will demonstrate how to lower one's disk space usage. The following procedures can be applied to all of OSC's file systems.

We recommend users regularly check their data usage and clean out old data that is no longer needed.

Users who need assistance lowering their data usage can contact OSC Help.

Preventing Excessive Data Usage Before It Starts

Users should ensure that their jobs are written in such a way that temporary data is not saved to permanent file systems, such as the project space file system or their home directory.

If your job copies data from the scratch file system or its node's local disk ($TMPDIR) back to a permanent file system, such as the project space file system or a home directory ( /users/PXX####/xxx####/), you should ensure you are only copying the files you will need later.

Identifying Old and Large Data

The following commands will help you identify old data using the find command.

find commands may produce an excessive amount of output. To terminate the command while it is running, press CTRL + C.

Find all files in a directory that have not been accessed in the past 100 days:

This command will recursively search the user's home directory and give a detailed listing of all files not accessed in the past 100 days.

The last access time atime is updated when a file is opened by any operation, including grep, cat, head, sort, etc.

find ~ -atime +100 -exec ls -l {} \;
  • To search a different directory replace ~ with the path you wish to search. A period . can be used to search the current directory.
  • To view files not accessed over a different time span, replace 100 with your desired number of days.
  • To view the total size in bytes of all the files found by find, you can add | awk '{s+=$5} END {print "Total SIZE (bytes): " s}' to the end of the command:
find ~ -atime +100 -exec ls -l {} \;| awk '{s+=$5} END {print "Total SIZE (bytes): " s}'

Find all files in a directory that have not been modified in the past 100 days:

This command will recursively search the user's home directory and give a detailed listing of all files not modified in the past 100 days.

The last modified time mtime is updated when a file's contents are updated or saved. Viewing a file will not update the last modified time.

find ~ -mtime +100 -exec ls -l {} \; 
  • To search a different directory replace ~ with the path you wish to search. A period . can be used to search the current directory.
  • To view files not modified over a different time span, replace 100 with your desired number of days.
  • To view the total size in bytes of all the files found by find, you can add | awk '{s+=$5} END {print "Total SIZE (bytes): " s}' to the end of the command:
find ~ -mtime +100 -exec ls -l {} \;| awk '{s+=$5} END {print "Total SIZE (bytes): " s}'
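
The same search can be sketched in Python when the results need further processing. This is an approximation of the find commands above, not a replacement: it reports regular directory entries only (find also lists directories), and the names find_old_files and total_size are hypothetical.

```python
import os
import time

SECONDS_PER_DAY = 24 * 60 * 60


def find_old_files(root, days, stat_field="st_atime", now=None):
    """Yield (path, size_bytes) for files older than `days` whole days.

    stat_field is "st_atime" (like find -atime) or "st_mtime" (like
    find -mtime). As with find's +N, fractional days are ignored, so a
    file must be at least N+1 days old to match.
    """
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                info = os.lstat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            age_days = int((now - getattr(info, stat_field)) // SECONDS_PER_DAY)
            if age_days > days:
                yield path, info.st_size


def total_size(matches):
    """Sum the sizes reported by find_old_files."""
    return sum(size for _path, size in matches)
```

For example, total_size(find_old_files(os.path.expanduser("~"), 100)) gives the combined size in bytes of files not accessed in the past 100 days.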

List files larger than a specified size:

Adding the -size <size> option and argument to the find command allows you to only view files larger than a certain size. Prefix the size with + to match files larger than that size. This option and argument can be added to any other find command.

For example, to view all files in a user's home directory that are larger than 1GB:

find ~ -size +1G -exec ls -l {} \;

Deleting Identified Data

CAUTION: Be careful when deleting files. Be sure your command will do what you want before running it. Extra caution should be used when deleting files from a file system that is not backed up, such as the scratch file system.

If you no longer need the old data, you can delete it using the rm command.

If you need to delete a whole directory tree (a directory and all of its subcontents, including other directories), you can use the rm -R command.

For example, the following command will delete the data directory in a user's home directory:

rm -R ~/data

If you would like to be prompted for confirmation before deleting every file, use the -i option.

rm -Ri ~/data 

Enter y or n when prompted. Simply pressing the enter button will default to n.

Deleting files found by find

The rm command can be combined with any find command to delete the files found. The syntax for doing so is:

find <location> <other find options> -exec rm -i {} \;

Where <other find options> can include one or more of the options -atime <time>, -mtime <time>, and -size <size>.

The following command would find all files in the ~/data directory larger than 1G that have not been accessed in the past 100 days, and then prompt for confirmation to delete each file:

find ~/data -atime +100 -size +1G -exec rm -i {} \;

If you are absolutely sure the files identified by find are okay to delete you can remove the -i option to rm and you will not be prompted. Extreme caution should be used when doing so!

Archiving Data

If you still need the data but do not plan on needing the data in the immediate future, contact OSC Help to discuss moving the data to an archive file system. Requests for data to be moved to the archive file system should be larger than 1TB.

Compressing

If you need the data but do not access the data frequently, you should compress the data using tar or gzip.
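
As one possible way to do this from a script, the Python standard library's tarfile module can write a gzip-compressed tar archive of a directory. This is a minimal sketch; the function names compress_directory and list_archive are hypothetical.

```python
import os
import tarfile


def compress_directory(directory, archive_path):
    """Write directory and its contents to a gzip-compressed tar archive."""
    directory = os.path.abspath(directory)
    with tarfile.open(archive_path, "w:gz") as archive:
        # arcname keeps only the directory's own name inside the archive
        archive.add(directory, arcname=os.path.basename(directory))
    return archive_path


def list_archive(archive_path):
    """Return the member names so the archive can be checked before cleanup."""
    with tarfile.open(archive_path, "r:gz") as archive:
        return archive.getnames()
```

Checking the archive contents with list_archive before deleting the original data is a cheap safeguard, especially on file systems that are not backed up.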

Moving Data to a Local File System

If you have the space available locally you can transfer your data there using sftp or Globus.

Globus is recommended for large transfers.

The OnDemand File application should not be used for transfers larger than 1GB.


HOWTO: Submit Homework to Repository at OSC

This page outlines a way a professor can set up a file submission system at OSC for his/her classroom project.

Usage for Professor

After connecting to an OSC system, the professor runs submit_prepare as

$ /users/PZS0645/support/bin/submit_prepare

Follow the instructions and provide the needed information (name of the assignment, TA username if appropriate, a size limit if not the default 1000MB per student, and whether or not you want an email notification for each submission). The script will create a designated directory where students submit their assignments, and will generate a submit script that students use to submit homework to OSC. Both are located in the directory specified by the professor.

If you want to create multiple directories for different assignments, simply run the command again and specify a different assignment number:

$ /users/PZS0645/support/bin/submit_prepare

Note:

The PI can also enforce the deadline by simply changing the permission of the submission directory or renaming the submission directory at the deadline.

One way (which only works on Owens) is to use the at command, following the steps below:

  • Use at command to specify the deadline:
at [TIME]

where TIME is formatted HH:MM AM/PM MM/DD/YY. For example:

at 2:30 PM 08/21/2017
  • After running this command, run:
$ chmod 700 [DIRECTORY]

where DIRECTORY is the assignment folder to be closed off.

  • Enter [ctrl+D] to submit this command.

The permission of DIRECTORY will be changed to 700 at 2:30PM, August 21, 2017. After that, the student will get an error message when he/she tries to submit an assignment to this directory.

Usage for Students

A student should create one directory which includes all the files he/she wants to submit before running this script to submit his/her assignment. Also, the previous submission of the same assignment from the student will be replaced by the new submission. 

To submit the assignment, the student runs submit after connecting to OSC system as

$ /path/to/directory/from/professor/submit

Follow the instructions. It will allow students to submit an assignment to the designated directory specified by the professor and send a confirmation email, or return an error message.


HOWTO: Submit multiple jobs using parameters

Users often want to submit a large number of jobs at once, each with different parameters. These parameters could be anything, including the path of a data file or different input values for a program. This how-to will show you how you can do this using a simple python script, a CSV file, and a template script. You will need to adapt this advice for your own situation.

Consider the following batch script:

#!/bin/bash
#SBATCH --ntasks-per-node=2
#SBATCH --time=1:00:00
#SBATCH --job-name=week42_data8

# Copy input data to the node's fast local disk
cp ~/week42/data/source1/data8.in $TMPDIR

cd $TMPDIR

# Run the analysis
full_analysis data8.in data8.out

# Copy results to proper folder
cp  data8.out ~/week42/results

Let's say you need to submit 100 of these jobs on a weekly basis. Each job uses a different data file as input. You receive data from two different sources, and so your data is located within two different folders. All of the jobs from one week need to store their results in a single weekly results folder. The output file name is based upon the input file name.

Creating a Template Script

As you can see, this job follows a general template.  There are three main parameters that change in each job:

  1. The week
    • Used as part of the job name
    • Used to find the proper data file to copy to the node's local disk
    • Used to copy the results to the correct folder
  2. The data source
    • Used to find the proper data file to copy to the node's local disk
  3. The data file's name
    • Used as part of the job name
    • Used to find the proper data file to copy to the node's local disk
    • Used to specify both the input and output file to the program full_analysis
    • Used to copy the results to the correct folder

If we replace these parameters with variables, prefixed by the dollar sign $ and surrounded by curly braces { }, we get the following template script:

Slurm does not support using variables in the #SBATCH section, so we need to set the job name in the submit command.
#!/bin/bash
#SBATCH --ntasks-per-node=2
#SBATCH --time=1:00:00

# Copy input data to the node's fast local disk
cp ~/${WEEK}/data/${SOURCE}/${DATA}.in $TMPDIR
cd $TMPDIR

# Run the analysis 
full_analysis ${DATA}.in ${DATA}.out

# Copy results to proper folder
cp  ${DATA}.out ~/${WEEK}/results
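
To preview what the template looks like for one set of values before submitting anything, the variables can be filled in locally. The Python sketch below uses string.Template from the standard library; the render helper is hypothetical, and at run time the substitution is actually done by the shell using the exported environment variables, not by this code.

```python
from string import Template

TEMPLATE = """#!/bin/bash
#SBATCH --ntasks-per-node=2
#SBATCH --time=1:00:00

cp ~/${WEEK}/data/${SOURCE}/${DATA}.in $TMPDIR
cd $TMPDIR
full_analysis ${DATA}.in ${DATA}.out
cp ${DATA}.out ~/${WEEK}/results
"""


def render(template_text, **values):
    """Fill in known variables; unknown ones such as $TMPDIR are left alone."""
    return Template(template_text).safe_substitute(**values)


print(render(TEMPLATE, WEEK="week42", SOURCE="source1", DATA="data8"))
```

safe_substitute is used instead of substitute so that variables the batch system provides, such as $TMPDIR, stay untouched in the preview.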

Automating Job Submission

We can now use the sbatch --export option to pass parameters to our template script. The format for passing parameters is:

sbatch --job-name=name --export=var_name=value[,var_name=value...]

Submitting 100 jobs using the sbatch --export option manually does not make our task much easier than modifying and submitting each job one by one. To complete our task we need to automate the submission of our jobs. We will do this by using a python script that submits our jobs using parameters it reads from a CSV file.  

Note that python was chosen for this task for its general ease of use and understandability -- if you feel more comfortable using another scripting language feel free to interpret/translate this python code for your own use.

The script for submitting multiple jobs using parameters can be found at ~support/share/misc/submit_jobs.py

Use the following command to run a test with the examples already created:

Make sure to replace <your-proj-code> with a project you are a member of to charge jobs to.
~support/share/misc/submit_jobs.py -t ~support/share/misc/submit_jobs_examples/job_template2.sh WEEK,SOURCE,DATA ~support/share/misc/submit_jobs_examples/parameters_example2.csv <your-proj-code>

This script will open the CSV file and step through the file line by line, submitting a job for each line using the line's values. If the submit command returns a non-zero exit code, usually indicating it was not submitted, we will print this out to the display. The jobs will be submitted using the general format (using the example WEEK,SOURCE,DATA environment variables):

sbatch -A <project-account> -o ~/x/job_logs/x_y_z.job_log --job-name=x_y_z --export=WEEK=x,SOURCE=y,DATA=z job.sh

Where x, y and z are determined by the values in the CSV parameter file. Below we relate x to week, y to source and z to data.

Creating a CSV File

We now need to create a CSV file with parameters for each job. This can be done with a regular text editor or using a spreadsheet editor such as Excel. By default you should use commas as your delimiter.  

Here is our CSV file with parameters:

week42,source1,data1
week42,source1,data2
week42,source1,data3
...
week42,source2,data98
week42,source2,data99
week42,source2,data100

The submit script would read in the first row of this CSV file and form and execute the command:

sbatch -A <project-account> -o week42/job_logs/week42_source1_data1.job_log --job-name=week42_source1_data1 --export=WEEK=week42,SOURCE=source1,DATA=data1 job.sh
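
The mapping from CSV rows to submit commands can be sketched in a few lines of Python. This is not the code of submit_jobs.py, only an illustration of the idea under stated assumptions: the function name build_sbatch_commands is hypothetical, the log path layout follows the example command above, and nothing is submitted.

```python
import csv
import io


def build_sbatch_commands(csv_text, parameter_names, account, jobscript):
    """Return one sbatch argument list per CSV row (nothing is submitted)."""
    names = parameter_names.split(",")
    commands = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # ignore blank lines
        if len(row) != len(names):
            raise ValueError(f"expected {len(names)} values, got {row!r}")
        job_name = "_".join(row)
        exports = ",".join(f"{n}={v}" for n, v in zip(names, row))
        commands.append([
            "sbatch",
            "-A", account,
            "-o", f"{row[0]}/job_logs/{job_name}.job_log",
            f"--job-name={job_name}",
            f"--export={exports}",
            jobscript,
        ])
    return commands
```

Printing " ".join(command) for each entry gives a dry-run listing similar in spirit to what the -t flag of the real script is described as providing.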

Submitting Jobs

Once all the above is done, all you need to do to submit your jobs is to make sure the CSV file is populated with the proper parameters and run the automatic submission script with the right flags.

Try using submit_jobs.py --help for an explanation:

$ ~support/share/misc/submit_jobs.py --help
usage: submit_jobs.py [-h] [-t]
                      jobscript parameter_names job_parameters_file account

Automatically submit jobs using a csv file; examples in
~support/share/misc/submit_jobs_examples/

positional arguments:
  jobscript            job script to use
  parameter_names      comma separated list of names for each parameter
  job_parameters_file  csv parameter file to use
  account              project account to charge jobs to

optional arguments:
  -h, --help           show this help message and exit
  -t, --test           test script without submitting jobs
Before submitting a large number of jobs for the first time using this method it is recommended you test with a small number of jobs and using the -t flag as well to check the submit commands.

Modifying for unique uses

It is a good idea to copy the ~support/share/misc/submit_jobs.py file and modify for unique use cases.

Contact oschelp@osc.edu and OSC staff can assist if there are questions using the default script or adjusting the script for unique use cases.

HOWTO: Tune Performance

Table of Contents

Introduction

Setup

Debugging

Hardware

Performance Measurement

Timing

Profiling

Help From the Compiler

Compiler Optimization Reports

Memory Optimizations

Vectorization/Streaming

OpenMP

MPI

GPU Accelerated Computing

Summary

 

Introduction

This tutorial goes over techniques to tune the performance of your application. Keep in mind that correctness of results, code readability/maintainability, and portability to future systems are more important than performance. Some factors that can affect performance are:

  • Effective use of processor features for a high degree of internal concurrency in a single core
  • Memory access patterns (memory access is slow compared to computation)
  • Use of an appropriate file system for file I/O
  • Scalability of algorithms
  • Compiler optimizations
  • Explicit parallelism

We will be using this code based on the HPCCG miniapp from Mantevo. It performs the Conjugate Gradient (CG) method on a 3D chimney domain. CG is an iterative algorithm to numerically approximate the solution to a system of linear equations.

Run code with:

srun -n <numprocs> ./test_HPCCG nx ny nz

where nx, ny, nz are the number of nodes in the x, y, and z dimension on each processor.

 

Setup

First start an interactive Pitzer Desktop session with OnDemand.

You need to load intel 19.0.5 and mvapich2 2.3.3:

module load intel/19.0.5 mvapich2/2.3.3

Then clone the repository:

git clone https://code.osu.edu/khuvis.1/performance_handson.git

 

Debugging

Debuggers let you execute your program one line at a time, inspect variable values, stop your program at a particular line, and open a core file after the program crashes.

For debugging, use the -g flag and remove optimization or set it to -O0. For example:

icc -g -O0 -o mycode mycode.c
gcc -g -O0 -o mycode mycode.c

To see compiler warnings and diagnostic options:

icc -help diag
man gcc

ARM DDT

ARM DDT is a commercial debugger produced by ARM. It can be loaded on all OSC clusters:

module load arm-ddt

To run a non-MPI program from the command line:

ddt --offline --no-mpi ./mycode [args]

To run an MPI program from the command line:

ddt --offline -np num.procs ./mycode [args]
Hands On

Compile and run the code:

make
srun -n 2 ./test_HPCCG 150 150 150

You should have received the following error message at the end of the program output:

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 308893 RUNNING AT p0200
=   EXIT CODE: 11
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions

Set compiler flags -O0 -g  to CPP_OPT_FLAGS in Makefile. Then recompile and run with ARM DDT:

make clean; make
module load arm-ddt
ddt -np 2 ./test_HPCCG 150 150 150

 

Solution

ddt_handson.png

When DDT stops on the segmentation fault, the stack is in the YAML_Element::~YAML_Element function of YAML_Element.cpp. Looking at this function, we see that the loop stops at children.size() instead of children.size()-1. So, line 13 should be changed from

for(size_t i=0; i<=children.size(); i++) {

to

for(size_t i=0; i<children.size(); i++) {

 

Hardware

On Pitzer, there are 40 cores per node (20 cores per socket and 2 sockets per node). There is support for AVX512, vector length 8 double or 16 single precision values and fused multiply-add. (There is hardware support for 4 threads per core, but it is currently not enabled on OSC systems.)

There are three cache levels on Pitzer, and the statistics are shown in the table below:

Pitzer Cache Statistics
Cache level Size (KB) Latency (cycles) Max BW (bytes/cycle) Sustained BW (bytes/cycle)
L1 DCU 32 4-6 192 133
L2 MLC 1024 14 64 52
L3 LLC 28160 50-70 16 15
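
To relate these sizes to array working sets, it can help to convert them into a count of double precision values. The short Python sketch below does that arithmetic, assuming 1 KB = 1024 bytes and 8 bytes per double; the helper name doubles_that_fit is hypothetical.

```python
# Cache sizes in KB, copied from the table above.
CACHE_KB = {"L1": 32, "L2": 1024, "L3": 28160}
BYTES_PER_DOUBLE = 8


def doubles_that_fit(cache_kb):
    """Number of double precision values that fit in a cache of this size."""
    return cache_kb * 1024 // BYTES_PER_DOUBLE


for level, size_kb in CACHE_KB.items():
    print(level, doubles_that_fit(size_kb))
```

Under these assumptions, L1 holds 4096 doubles, L2 holds 131072, and L3 holds 3604480.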

Never do heavy I/O in your home directory. Home directories are for long-term storage, not scratch files.

One option for I/O intensive jobs is to use the local disk on a compute node. Stage files to and from your home directory into $TMPDIR using the pbsdcp command (e.g. pbsdcp file1 file2 $TMPDIR), and execute the program in $TMPDIR.

Another option is to use the scratch file system ($PFSDIR). This is faster than other file systems, good for parallel jobs, and may be faster than local disk.

For more information about OSC's file system, click here.

For example batch scripts showing the use of $TMPDIR and $PFSDIR, click here.

For more information about Pitzer, click here.

 

Performance Measurement

FLOPS stands for "floating point operations per second." Pitzer has a theoretical maximum of 720 teraflops. The LINPACK benchmark, which solves a dense system of linear equations, reaches 543 teraflops. The STREAM benchmark, which measures sustainable memory bandwidth and the corresponding computation rate for vector kernels, reports copy: 299095.01 MB/s, scale: 298741.01 MB/s, add: 331719.18 MB/s, and triad: 331712.19 MB/s. Application performance is typically much less than peak/sustained performance since applications usually do not take full advantage of all hardware features.

 

Timing

You can time a program using the /usr/bin/time command. It gives results for user time (CPU time spent running your program), system time (CPU time spent by your program in system calls), and elapsed time (wallclock). It also shows % CPU, which is (user + system) / elapsed, as well as memory, pagefault, swap, and I/O statistics.

/usr/bin/time j3
5415.03user 13.75system 1:30:29elapsed 99%CPU \
(0avgtext+0avgdata 0maxresident)k \
0inputs+0outputs (255major+509333minor)pagefaults 0 swaps
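
The % CPU figure in this sample can be reproduced from the other numbers using the formula given above. A minimal Python sketch of the arithmetic follows; the helper names are hypothetical.

```python
def elapsed_to_seconds(elapsed):
    """Convert 'h:mm:ss' or 'm:ss' (seconds may be fractional) to seconds."""
    seconds = 0.0
    for part in elapsed.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def cpu_percent(user, system, elapsed):
    """Percent CPU as (user + system) / elapsed."""
    return 100.0 * (user + system) / elapsed_to_seconds(elapsed)


print(int(cpu_percent(5415.03, 13.75, "1:30:29")))  # 99
```

Here 1:30:29 is 5429 seconds, and (5415.03 + 13.75) / 5429 is just under 1, which truncates to the 99% shown in the sample output.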

You can also time portions of your code:

  C/C++ Fortran 77/90 MPI (C/C++/Fortran)
Wallclock time(2), difftime(3), getrusage(2) SYSTEM_CLOCK(2) MPI_Wtime(3)
CPU times(2) DTIME(3), ETIME(3) X
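
The distinction between wallclock and CPU time in the table above applies in any language. As an illustration only (Python is not one of the languages in the table), the sketch below times a section of code both ways using the standard library; the helper name time_section is hypothetical.

```python
import time


def time_section(func, *args):
    """Run func(*args) and return (result, wallclock_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()   # wallclock timer
    cpu_start = time.process_time()    # user + system CPU time of this process
    result = func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return result, wall, cpu


_, wall, cpu = time_section(sum, range(1_000_000))
print(f"wallclock {wall:.4f} s, CPU {cpu:.4f} s")
```

A section that spends its time waiting (for example on I/O or sleep) shows a wallclock time much larger than its CPU time, while a compute-bound section shows the two close together.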

 

Profiling

A profiler can show you whether code is compute-bound, memory-bound, or communication bound. Also, it shows how well the code uses available resources and how much time is spent in different parts of your code. OSC has the following profiling tools: ARM Performance Reports, ARM MAP, Intel VTune, Intel Trace Analyzer and Collector (ITAC), Intel Advisor, TAU Commander, and HPCToolkit.

For profiling, use the -g flag and specify the same optimization level that you would normally use with -On. For example:

icc -g -O3 -o mycode mycode.c

Look for

  • Hot spots (where most of the time is spent)
  • Excessive number of calls to short functions (use inlining!)
  • Memory usage (swapping and thrashing are not allowed at OSC)
  • % CPU (low CPU utilization may mean excessive I/O delays).

ARM Performance Reports

ARM PR works on precompiled binaries, so the -g flag is not needed. It gives a summary of your code's performance that you can view with a browser.

For a non-MPI program:

module load arm-pr
perf-report --no-mpi ./mycode [args]

For an MPI program:

module load arm-pr
perf-report --np num_procs ./mycode [args]

ARM MAP

Interpreting this profile requires some expertise. It gives details about your code's performance. You can view and explore the resulting profile using an ARM client.

For a non-MPI program:

module load arm-map
map --no-mpi ./mycode [args]

For an MPI program:

module load arm-map
map --np num_procs ./mycode [args]

For more information about ARM Tools, view OSC resources or visit ARM's website.

Intel Trace Analyzer and Collector (ITAC)

ITAC is a graphical tool for profiling MPI code (Intel MPI).

To use:

module load intelmpi # then compile (-g) code
mpiexec -trace ./mycode

View and explore the results using a GUI with traceanalyzer:

traceanalyzer <mycode>.stf

 

Help From the Compiler

HPC software is traditionally written in Fortran or C/C++. OSC supports several compiler families. Intel (icc, icpc, ifort) usually gives the fastest code on Intel architecture. Portland Group (PGI - pgcc, pgc++, pgf90) is good for GPU programming with OpenACC. GNU (gcc, g++, gfortran) is open source and universally available.

Compiler options are easy to use and let you control aspects of the optimization. Keep in mind that different compilers have different values for options.  For all compilers, any highly optimized builds, such as those employing the options herein, should be thoroughly validated for correctness.

Some examples of optimization include:

  • Function inlining (eliminating function calls)
  • Interprocedural optimization/analysis (ipo/ipa)
  • Loop transformations (unrolling, interchange, splitting, tiling)
  • Vectorization (operate on arrays of operands)
  • Automatic parallelization of loops (very conservative multithreading)

Compiler flags to try first are:

  • General optimization flags (-O2, -O3, -fast)
  • Fast math
  • Interprocedural optimization/analysis

Faster operations are sometimes less accurate. For Intel compilers, fast math is default with -O2 and -O3. If you have a problem, use -fp-model precise. For GNU compilers, precise math is default with -O2 and -O3. If you want faster performance, use -ffast-math.

Inlining is replacing a subroutine or function call with the actual body of the subprogram. It eliminates overhead of calling the subprogram and allows for more loop optimizations. Inlining for one source file is typically automatic with -O2 and -O3.

Optimization Compiler Options

Options for Intel compilers are shown below. Don't use -fast for MPI programs with Intel compilers. Use the same compiler command to link for -ipo with separate compilation. Many other optimization options can be found in the man pages. The recommended options are -O3 -xHost. An example is ifort -O3 program.f90.

-fast Common optimizations
-On Set optimization level (0, 1, 2, 3)

-ipo Interprocedural optimization, multiple files
-O3 Loop transforms
-xHost Use highest instruction set available
-parallel Loop auto-parallelization

Options for PGI compilers are shown below. Use the same compiler command to link for -Mipa with separate compilation. Many other optimization options can be found in the man pages. The recommended option is -fast. An example is pgf90 -fast program.f90.

-fast Common optimizations
-On Set optimization level (0, 1, 2, 3, 4)

-Mipa Interprocedural optimization
-Mconcur Loop auto-parallelization

Options for GNU compilers are shown below. Many other optimization options can be found in the man pages. The recommended options are -O3 -ffast-math. An example is gfortran -O3 program.f90.

-On Set optimization level (0, 1, 2, 3)
N/A for separate compilation Interprocedural optimization
-O3 Loop transforms
-ffast-math Possibly unsafe floating point optimizations
-march=native Use highest instruction set available

Hands On

Compile and run with different compiler options:

time srun -n 2 ./test_HPCCG 150 150 150

Using the optimal compiler flags, get an overview of the bottlenecks in the code with the ARM performance report:

module load arm-pr
perf-report -np 2 ./test_HPCCG 150 150 150
Solution

 

On Pitzer, sample times were:

Compiler Option Runtime (seconds)
-g 129
-O0 -g 129
-O1 -g 74
-O2 -g 74
-O3 -g 74

The performance report shows that the code is compute-bound.

pr_handson.png

 

Compiler Optimization Reports

Compiler optimization reports let you understand how well the compiler is doing at optimizing your code and what parts of your code need work. They are generated at compile time and describe what optimizations were applied at various points in the source code. The report may tell you why optimizations could not be performed.

For Intel compilers, use -qopt-report; the report is written to a file.

For Portland Group compilers, use -Minfo; the report is written to stderr.

For GNU compilers, use -fopt-info; the report is written to stderr by default.

A sample output is:

LOOP BEGIN at laplace-good.f(10,7)
   remark #15542: loop was not vectorized: inner loop was already vectorized

   LOOP BEGIN at laplace-good.f(11,10)
   <Peeled loop for vectorization>
   LOOP END

   LOOP BEGIN at laplace-good.f(11,10)
      remark #15300: LOOP WAS VECTORIZED
   LOOP END

   LOOP BEGIN at laplace-good.f(11,10)
   <Remainder loop for vectorization>
      remark #15301: REMAINDER LOOP WAS VECTORIZED
   LOOP END

   LOOP BEGIN at laplace-good.f(11,10)
   <Remainder loop for vectorization>
   LOOP END
LOOP END

Hands On

Add the compiler flag -qopt-report=5 and recompile to view an optimization report.

Vectorization/Streaming

Code is structured to operate on arrays of operands. Vector instructions are built into the processor. On Pitzer, the vector length is 16 single or 8 double precision. The following is a vectorizable loop:

do i = 1,N
  a(i) = b(i) + x(1) * c(i)
end do

Some things that can inhibit vectorization are:

  • Loops being in the wrong order (usually fixed by compiler)
  • Loops over derived types
  • Function calls (can sometimes be fixed by inlining)
  • Too many conditionals
  • Indexed array accesses

Hands On

Use ARM MAP to identify the most expensive parts of the code.

module load arm-map
map -np 2 ./test_HPCCG 150 150 150

Check the optimization report previously generated by the compiler (with -qopt-report=5) to see if any of the loops in the regions of the code are not being vectorized. Modify the code to enable vectorization and rerun the code.

Solution

vec_handson.png

Map shows that the most expensive segment of the code is lines 83-84 of HPC_sparsemv.cpp:

for (int j=0; j< cur_nnz; j++)
  y[i] += cur_vals[j]*x[cur_inds[j]];

The optimization report confirms that the loop was not vectorized due to a dependence on y.

optreport_handson.png

Incrementing a temporary variable instead of y[i] should enable vectorization:

for (int j=0; j< cur_nnz; j++)
  sum += cur_vals[j]*x[cur_inds[j]];
y[i] = sum;

Recompiling and rerunning with this change reduces the runtime from 74 seconds to 63 seconds.

Memory Optimizations

Memory access is often the most important factor in your code's performance. Loops that work with arrays should use a stride of one whenever possible. C and C++ are row-major (store elements consecutively by row in 2D arrays), so the first array index should be the outermost loop and the last array index should be the innermost loop. Fortran is column-major, so the reverse is true. You can get a factor of 3 or 4 speedup just by using unit stride. Avoid using arrays of derived data types, structs, or classes. For example, use structs of arrays instead of arrays of structs.

Efficient cache usage is important. Cache lines are 8 words (64 bytes) of consecutive memory. The entire cache line is loaded when a piece of data is fetched.

The code below is a good example. 2 cache lines are used for every 8 loop iterations, and it is unit stride:

real*8 a(N), b(N)
do i = 1,N
  a(i) = a(i) + b(i)
end do

! 2 cache lines:
! a(1), a(2), a(3) ... a(8)
! b(1), b(2), b(3) ... b(8)

The code below is a bad example. 1 cache line is loaded for each loop iteration, and it is not unit stride:

TYPE :: node
  real*8 a, b, c, d, w, x, y, z
END TYPE node
TYPE(node) :: s(N)
do i = 1, N
  s(i)%a = s(i)%a + s(i)%b
end do

! cache line:
! a(1), b(1), c(1), d(1), w(1), x(1), y(1), z(1)

Hands On

Look again at the most expensive parts of the code using ARM MAP:

module load arm-map
map -np 2 ./test_HPCCG 150 150 150

Look for any inefficient memory access patterns. Modify the code to improve memory access patterns and rerun the code. Do these changes improve performance?

Solution

Lines 110-148 of generate_matrix.cpp are nested loops:

for (int ix=0; ix<nx; ix++) {
  for (int iy=0; iy<ny; iy++) {
    for (int iz=0; iz<nz; iz++) {
      int curlocalrow = iz*nx*ny+iy*nx+ix;
      int currow = start_row+iz*nx*ny+iy*nx+ix;
      int nnzrow = 0;
      (*A)->ptr_to_vals_in_row[curlocalrow] = curvalptr;
      (*A)->ptr_to_inds_in_row[curlocalrow] = curindptr;
      .
      .
      .
    }
  }
}

The arrays are indexed so that consecutive values of ix are adjacent in memory. However, the loops are ordered so that ix is the outermost loop. We can reorder the loops so that ix is iterated in the innermost loop:

for (int iz=0; iz<nz; iz++) {
  for (int iy=0; iy<ny; iy++) {
    for (int ix=0; ix<nx; ix++) {
      .
      .
      .
    }
  }
}

This reduces the runtime from 63 seconds to 22 seconds.

OpenMP

OpenMP is a shared-memory, threaded parallel programming model. It is a portable standard with a set of compiler directives and a library of support functions. It is supported in compilers by Intel, Portland Group, GNU, and Cray.

The following are parallel loop execution examples in Fortran and C. The inner loop vectorizes while the outer loop executes on multiple threads:

PROGRAM omploop
INTEGER, PARAMETER :: N = 1000
INTEGER i, j
REAL, DIMENSION(N, N) :: a, b, c, x
... ! Initialize arrays
!$OMP PARALLEL DO
do j = 1, N
  do i = 1, N
    a(i, j) = b(i, j) + x(i, j) * c(i, j)
  end do
end do
!$OMP END PARALLEL DO
END PROGRAM omploop

int main() {
  int N = 1000;
  float *a, *b, *c, *x;
  ... // Allocate and initialize arrays
#pragma omp parallel for
  for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
      a[i*N+j] = b[i*N+j] + x[i*N+j] * c[i*N+j];
    }
  }
}

To compile a program with OpenMP, add the appropriate compiler option.

For Intel compilers, add the -qopenmp option. For example, ifort -qopenmp ompex.f90 -o ompex.

For GNU compilers, add the -fopenmp option. For example, gcc -fopenmp ompex.c -o ompex.

For Portland Group compilers, add the -mp option. For example, pgf90 -mp ompex.f90 -o ompex.

To run an OpenMP program, request multiple processors through Slurm (-N 1 -n 40) and set the OMP_NUM_THREADS environment variable (the default is to use all available cores). For the best performance, run at most one thread per core.

An example script is:

#!/bin/bash
#SBATCH -J omploop
#SBATCH -N 1
#SBATCH -n 40
#SBATCH -t 1:00

export OMP_NUM_THREADS=40
/usr/bin/time ./omploop

For more information, visit http://www.openmp.org, OpenMP Application Program Interface, and self-paced tutorials. OSC will host an XSEDE OpenMP workshop on November 5, 2019.

 

MPI

MPI stands for Message Passing Interface. It is used when multiple processes run on one or more nodes. MPI has functions for point-to-point communication (e.g. MPI_Send, MPI_Recv). It also provides a number of functions for typical collective communication patterns, including MPI_Bcast (broadcasts a value from the root process to all other processes), MPI_Reduce (reduces values on all processes to a single value on a root process), MPI_Allreduce (reduces values on all processes to a single value and distributes the result back to all processes), MPI_Gather (gathers together values from a group of processes to a root process), and MPI_Alltoall (sends data from all processes to all processes).

A simple MPI program is:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
  int rank, size;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &size);
  printf("Hello from node %d of %d\n", rank, size);
  MPI_Finalize();
  return(0);
}

MPI implementations available at OSC are mvapich2, Intel MPI (only for Intel compilers), and OpenMPI.

MPI programs can be compiled with MPI compiler wrappers (mpicc, mpicxx, mpif90). They accept the same arguments as the compilers they wrap. For example, mpicc -o hello hello.c.

MPI programs must run in batch only. Debugging runs may be done with interactive batch jobs. srun automatically determines the execution nodes from the Slurm allocation:

#!/bin/bash
#SBATCH -J mpi_hello
#SBATCH -N 2
#SBATCH --ntasks-per-node=40
#SBATCH -t 1:00

cd $SLURM_SUBMIT_DIR
srun ./hello

For more information about MPI, visit MPI Forum and MPI: A Message-Passing Interface Standard. OSC will host an XSEDE MPI workshop on September 3-4, 2019. Self-paced tutorials are available here.

Hands On

Use ITAC to get a timeline of the run of the code.

module load intelmpi
LD_PRELOAD=libVT.so \
mpiexec -trace -np 40 ./test_HPCCG 150 150 150
traceanalyzer <stf_file>

Look at the Event Timeline (under Charts). Do you see any communication patterns that could be replaced by a single MPI command?

Solution

Looking at the Event Timeline, we see that a large part of the runtime is spent in the following communication pattern: MPI_Barrier, MPI_Send/MPI_Recv, MPI_Barrier. We also see that during this communication rank 0 is sending data to all other ranks. We should be able to replace all of these MPI calls with a single call to MPI_Bcast.

The relevant code is in lines 82-89 of ddot.cpp:

  MPI_Barrier(MPI_COMM_WORLD);
  if(rank == 0) {
    for(int dst_rank=1; dst_rank < size; dst_rank++) {
      MPI_Send(&global_result, 1, MPI_DOUBLE, dst_rank, 1, MPI_COMM_WORLD);
    }
  }
  if(rank != 0) MPI_Recv(&global_result, 1, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
  MPI_Barrier(MPI_COMM_WORLD);

and can be replaced with:

MPI_Bcast(&global_result, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);

 

Interpreted Languages

Although many of the tools we already mentioned can also be used with interpreted languages, most interpreted languages such as Python and R have their own profiling tools.

Since they are still running on the same hardware, the performance considerations for interpreted languages are very similar to those for compiled languages:

  • Vectorization
  • Efficient memory utilization
  • Use built-in and library functions where possible
  • Use appropriate data structures
  • Understand and use best practices for the language

One of Python's most common profiling tools is cProfile. The simplest way to use cProfile is to add several arguments to your Python call so that it prints an ordered list of the time spent in all functions called during execution. For instance, if a program is typically run with the command:

python ./mycode.py

replace that with

python -m cProfile -s time ./mycode.py

Here is a sample output from this profiler:
python_profile.png

See Python's documentation for more details on how to use cProfile.

One of the most popular profilers for R is profvis. It is not available by default with R so it will need to be installed locally before its first use and loaded into your environment prior to each use. To profile your code, just put how you would usually call your code as the argument into profvis:

$ R
> install.packages('profvis')
> library('profvis')
> profvis({source('mycode.R')})

Here is a sample output from profvis:
r_profile.png
More information on profvis is available here.

Hands On

Python

First, enter the Python/ subdirectory of the code containing the Python script ns.py. Profile this code with cProfile to determine the most expensive functions of the code. Next, rerun and profile with array as an argument to ns.py. Which version runs faster? Can you determine why it runs faster?

Solution

Execute the following commands:

python -m cProfile -s time ./ns.py
python -m cProfile -s time ./ns.py array

In the original code, 66 seconds out of 68 seconds are spent in presPoissPeriodic. When the array argument is passed, the time spent in this function is approximately 1 second and the total runtime goes down to about 2 seconds.

The speedup comes from the vectorization of the main computation in the body of presPoissPeriodic, which replaces nested for loops with a single-line operation on arrays.

R

Now, enter the R/ subdirectory of the code containing the R script lu.R. Make sure that you have the R module loaded. First, run the code with profvis without any additional arguments and then again with frmt="matrix".
Which version of the code runs faster? Can you tell why it runs faster based on the profile?

Solution

Runtime for the default version is 28 seconds while the runtime when frmt="matrix" is 20 seconds.
Here is the profile with default arguments:
rprof_df_trace.png

And here is the profile with frmt="matrix":
rprof_mat_trace.png

We can see that most of the time is being spent in lu_decomposition. The difference, however, is that the dataframe version seems to have a much higher overhead associated with accessing elements of the dataframe. On the other hand, the profile of the matrix version seems to be much flatter with fewer functions being called during LU decomposition. This reduction in overhead by using a matrix instead of a dataframe results in the better performance.

HOWTO: Tune VASP Memory Usage

This article discusses memory tuning strategies for VASP.

Data Distribution

Typically the first approach for memory sensitive VASP issues is to tweak the data distribution (via NCORE or NPAR). The information at https://www.vasp.at/wiki/index.php/NPAR covers a variety of machines. OSC has fast communications via Infiniband. 

Performance and memory consumption are dependent on the simulation model. So we recommend a series of benchmarks varying the number of nodes and NCORE. The recommended initial value for NCORE is the processor count per node, which is the ntasks-per-node value in Slurm (the ppn value in PBS). Of course, if this benchmarking is intractable then one must reexamine the model. For general points see https://www.vasp.at/wiki/index.php/Memory_requirements and https://www.vasp.at/wiki/index.php/Not_enough_memory. And of course one should start small and incrementally improve or scale up one's model.

Rationalization

Using the key parameters with respect to memory scaling listed at the VASP memory requirements page, one can rationalize VASP memory usage. The general approach is to study working calculations and then apply that understanding to scaled up or failing calculations. This might help one identify if a calculation is close to a node's memory limit and happens to cross over the limit for reasons that might be out of one's control, in which case one might need to switch to higher memory nodes.

Here is an example of rationalizing memory consumption.  Extract from a simulation output the key parameters:

Dimension of arrays:
k-points NKPTS = 18 k-points in BZ NKDIM = 18 number of bands NBANDS= 1344
total plane-waves NPLWV = 752640
...
dimension x,y,z NGXF= 160 NGYF= 168 NGZF= 224
support grid NGXF= 320 NGYF= 336 NGZF= 448

According to https://www.vasp.at/wiki/index.php/Memory_requirements, the memory estimate is NKDIM*NBANDS*NPLWV*16 + 4*(NGXF/2+1)*NGYF*NGZF*16 bytes, which yields about 273 GB for this calculation.

This estimate should be compared to actual memory reports. See for example XDMoD and Grafana. Note that most application software has an overhead in the ballpark of ten to twenty percent. In addition, disk caching can consume significant memory. Thus, one must adjust the memory estimate upward. It can then be compared to the available memory per cluster and per cluster node type.

Miscellaneous

  • OSC sets the default resource limits for shells, except for core dump file size, to unlimited; see the limit/ulimit/unlimit commands depending on your shell.
  • In the INCAR input file NWRITE=3 is for verbose output and NWRITE=4 is for debugging output.
  • OSC does not have a VASP license and our staff has limited experience with it.  So investigate alternate forms of help:  ask within your research group and  post on the VASP mailing list.
  • Valgrind is a tool that can be used for many types of debugging, including looking for memory corruption and leaks. However, it slows down your code by a very sizeable amount, so it might not be feasible for HPC codes.
  • ASAN (address sanitizer) is another tool that can be used for memory debugging. It is less featureful than Valgrind, but runs much quicker, and so will likely work with your HPC code.

 

HOWTO: Use 'rclone' to Upload Data

rclone is a tool that can be used to upload and download files to cloud storage (like Microsoft OneDrive or BuckeyeBox) from the command line. It is shipped as a standalone binary, but requires some user configuration before use. On this page, we provide instructions on how to use rclone to upload data to OneDrive. For instructions for other cloud storage services, check the rclone online documentation.

Setup

Before configuration, please first log into OSC OnDemand and request a Pitzer VDI session. Walltime of 1 hour should be sufficient to finish the configuration.  

Note: this does not work with the 'konqueror' browser present on OSC Systems. Please set default to Firefox first before you do the setup following the instructions below:
* xfce: Applications (Top left corner) -> Settings -> Preferred Applications
* mate: System (top bar towards the left) -> Preferences -> Preferred Applications

Once the session is ready, open a terminal. In the terminal, run the command

rclone config

It prompts you with a bunch of questions:

  • It shows "No remotes found -- make a new one" or lists the available remotes you made before
    •  Answer "n" for "New remote"
  • "name>" (the name for the new remote)
    • Type "OneDrive" (or whatever else you want to call this remote)
  • "Storage>" (the storage type of the new remote)
    • This should display a list to choose from. Enter the number corresponding to the "Microsoft OneDrive" storage type, which is "26".
    • (It is "6" for BuckeyeBox)
  • "client_id>"
    • Leave this blank (just press enter).
  • "client_secret>"
    • Leave this blank (just press enter).
  • "Edit advanced config?"
    • Type "n" for no
  • "Use auto config?"
    • Answer "y" for yes
  • A web browser window should pop up allowing you to log into OneDrive. It is a good idea at this point to verify that the URL is actually OneDrive before entering any credentials
    • Enter your OSU email
    • This should take you to the OSU login page. Login with your OSU credentials 
    • Go back to the terminal once "Success" is displayed.
  • "Your choice>"
    • One of five options to locate the drive you wish to use.
    • Type "1" to use your personal or business OneDrive
  • "Choose drive to use"
    • Type "0"
  • "Is this Okay? y/n>"
    • Type "y" to confirm the drive you wish to use is correct.
  • "y/e/d>"
    • Type "y" to confirm you wish to add this remote to rclone.

Testing rclone

Note: you do not need to use Pitzer VDI when you run 'rclone'. You can test the data transfer with a small file using login nodes (either Pitzer or Owens), or request a regular compute node to do the data transfer with large files. 

Create an empty hello.txt file and upload it to OneDrive using 'rclone copy' as below in a terminal:

touch hello.txt
rclone copy hello.txt OneDrive:/test

This creates a toplevel directory in OneDrive called 'test' if it does not already exist, and uploads the file hello.txt to it.

To verify the uploading is successful, you can either login to OneDrive in a web browser to check the file, or use rclone ls command in the terminal as:

rclone ls OneDrive:/test

Note: be careful when using ls on a large directory, because it is recursive. You can add the '--max-depth 1' flag to stop the recursion.

HOWTO: Use Address Sanitizer

Address Sanitizer is a tool developed by Google to detect memory access errors such as use-after-free bugs and memory leaks. It is built into GCC versions >= 4.8 and can be used on both C and C++ codes. Address Sanitizer uses runtime instrumentation to track memory allocations, which means you must build your code with Address Sanitizer to take advantage of its features.

There is extensive documentation on the AddressSanitizer Github Wiki.

Memory leaks can increase the total memory used by your program. It's important to properly free memory when it's no longer required. For small programs, losing a few bytes here and there may not seem like a big deal. However, for long-running programs that use gigabytes of memory, avoiding memory leaks becomes increasingly vital. If your program fails to free the memory it uses when it no longer needs it, it can run out of memory, resulting in early termination of the application. AddressSanitizer can help detect these memory leaks.

Additionally, AddressSanitizer can detect use-after-free bugs. A use-after-free bug occurs when a program tries to read or write to memory that has already been freed. This is undefined behavior and can lead to corrupted data, incorrect results, and even program crashes.

Building With Address Sanitizer

We need to use gcc to build our code, so we'll load the gcc module:

module load gnu/9.1.0

The "-fsanitize=address" flag is used to tell the compiler to add AddressSanitizer.

Additionally, due to some environmental configuration settings on OSC systems, we must also statically link against Asan. This is done using the "-static-libasan" flag.

It's helpful to compile the code with debug symbols. AddressSanitizer will print line numbers if debug symbols are present. To do this, add the "-g" flag. Additionally, the "-fno-omit-frame-pointer" flag may be helpful if you find that your stack traces do not look quite correct.

In one command, this looks like:

gcc main.c -o main -fsanitize=address -static-libasan -g

Or, splitting into separate compiling and linking stages:

gcc -c main.c -fsanitize=address -g
gcc main.o -o main -fsanitize=address -static-libasan

Notice that both the compilation and linking steps require the "-fsanitize=address" flag, but only the linking step requires "-static-libasan". If your build system is more complex, it might make sense to put these flags in the CFLAGS and LDFLAGS environment variables.

And that's it!

Examples

No Leak

First, let's look at a program that has no memory leaks (noleak.c):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, const char *argv[]) {
    char *s = malloc(100);
    strcpy(s, "Hello world!");
    printf("string is: %s\n", s);
    free(s);
    return 0; 
}

To build this we run:

gcc noleak.c -o noleak -fsanitize=address -static-libasan -g

And, the output we get after running it:

string is: Hello world!

That looks correct! Since there are no memory leaks in this program, AddressSanitizer did not print anything. But, what happens if there are leaks?

Missing free

Let's look at the above program again, but this time, remove the free call (leak.c):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, const char *argv[]) {
    char *s = malloc(100);
    strcpy(s, "Hello world!");
    printf("string is: %s\n", s);
    return 0;
}

Then, to build:

gcc leak.c -o leak -fsanitize=address -static-libasan -g

And the output:

string is: Hello world!

=================================================================
==235624==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 100 byte(s) in 1 object(s) allocated from:
    #0 0x4eaaa8 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:144
    #1 0x5283dd in main /users/PZS0710/edanish/test/asan/leak.c:6
    #2 0x2b0c29909544 in __libc_start_main (/lib64/libc.so.6+0x22544)

SUMMARY: AddressSanitizer: 100 byte(s) leaked in 1 allocation(s).

This is a leak report from AddressSanitizer. It detected that 100 bytes were allocated, but never freed. Looking at the stack trace that it provides, we can see that the memory was allocated on line 6 in leak.c.

Use After Free

Say we found the above leak in our code, and we wanted to fix it. We need to add a call to free. But, what if we add it in the wrong spot?

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, const char *argv[]) {
    char *s = malloc(100);
    free(s);
    strcpy(s, "Hello world!");
    printf("string is: %s\n", s);
    return 0;
}

The above (uaf.c) is clearly wrong. Albeit a contrived example, the allocated memory, pointed to by "s", was written to and read from after it was freed.

To build:

gcc uaf.c -o uaf -fsanitize=address -static-libasan -g

Building it and running it, we get the following report from AddressSanitizer:

=================================================================
==244157==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b0000000f0 at pc 0x00000047a560 bp 0x7ffcdf0d59f0 sp 0x7ffcdf0d51a0
WRITE of size 13 at 0x60b0000000f0 thread T0
    #0 0x47a55f in __interceptor_memcpy ../../.././libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:790
    #1 0x528403 in main /users/PZS0710/edanish/test/asan/uaf.c:8
    #2 0x2b47dd204544 in __libc_start_main (/lib64/libc.so.6+0x22544)
    #3 0x405f5c  (/users/PZS0710/edanish/test/asan/uaf+0x405f5c)

0x60b0000000f0 is located 0 bytes inside of 100-byte region [0x60b0000000f0,0x60b000000154)
freed by thread T0 here:
    #0 0x4ea6f7 in __interceptor_free ../../.././libsanitizer/asan/asan_malloc_linux.cc:122
    #1 0x5283ed in main /users/PZS0710/edanish/test/asan/uaf.c:7
    #2 0x2b47dd204544 in __libc_start_main (/lib64/libc.so.6+0x22544)

previously allocated by thread T0 here:
    #0 0x4eaaa8 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:144
    #1 0x5283dd in main /users/PZS0710/edanish/test/asan/uaf.c:6
    #2 0x2b47dd204544 in __libc_start_main (/lib64/libc.so.6+0x22544)

SUMMARY: AddressSanitizer: heap-use-after-free ../../.././libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:790 in __interceptor_memcpy
Shadow bytes around the buggy address:
  0x0c167fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c167fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c167fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c167fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c167fff8000: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
=>0x0c167fff8010: fd fd fd fd fd fa fa fa fa fa fa fa fa fa[fd]fd
  0x0c167fff8020: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa
  0x0c167fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c167fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c167fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c167fff8060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==244157==ABORTING

This is a bit intimidating. It looks like there's a lot going on here, but it's not as bad as it looks. Starting at the top, we see what AddressSanitizer detected. In this case, a "WRITE" of 13 bytes (from our strcpy). Immediately below that, we get a stack trace of where the write occurred. This tells us that the write occurred on line 8 in uaf.c in the function called "main".

Next, AddressSanitizer reports where the memory was located. We can ignore this for now, but depending on your use case, it could be helpful information.

Two key pieces of information follow. AddressSanitizer tells us where the memory was freed (the "freed by thread T0 here" section), giving us another stack trace indicating the memory was freed on line 7. Then, it reports where it was originally allocated ("previously allocated by thread T0 here:"), line 6 in uaf.c.

This is likely enough information to start to debug the issue. The rest of the report provides details about how the memory is laid out, and exactly which addresses were accessed/written to. You probably won't need to pay too much attention to this section. It's a bit "down in the weeds" for most use cases.

Heap Overflow

AddressSanitizer can also detect heap overflows. Consider the following code (overflow.c):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, const char *argv[]) {
    // whoops, forgot c strings are null-terminated
    // and not enough memory was allocated for the copy
    char *s = malloc(12);
    strcpy(s, "Hello world!");
    printf("string is: %s\n", s);
    free(s);
    return 0;
}

The "Hello world!" string is 13 characters long including the null terminator, but we've only allocated 12 bytes, so the strcpy above will overflow the buffer that was allocated. To build this:

gcc overflow.c -o overflow -fsanitize=address -static-libasan -g -Wall

Then, running it, we get the following report from AddressSanitizer:

==168232==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000003c at pc 0x000000423454 bp 0x7ffdd58700e0 sp 0x7ffdd586f890
WRITE of size 13 at 0x60200000003c thread T0
    #0 0x423453 in __interceptor_memcpy /apps_src/gnu/8.4.0/src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:737
    #1 0x5097c9 in main /users/PZS0710/edanish/test/asan/overflow.c:8
    #2 0x2ad93cbd7544 in __libc_start_main (/lib64/libc.so.6+0x22544)
    #3 0x405d7b  (/users/PZS0710/edanish/test/asan/overflow+0x405d7b)

0x60200000003c is located 0 bytes to the right of 12-byte region [0x602000000030,0x60200000003c)
allocated by thread T0 here:
    #0 0x4cd5d0 in __interceptor_malloc /apps_src/gnu/8.4.0/src/libsanitizer/asan/asan_malloc_linux.cc:86
    #1 0x5097af in main /users/PZS0710/edanish/test/asan/overflow.c:7
    #2 0x2ad93cbd7544 in __libc_start_main (/lib64/libc.so.6+0x22544)

SUMMARY: AddressSanitizer: heap-buffer-overflow /apps_src/gnu/8.4.0/src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:737 in __interceptor_memcpy
Shadow bytes around the buggy address:
  0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000: fa fa 00 fa fa fa 00[04]fa fa fa fa fa fa fa fa
  0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==168232==ABORTING

This is similar to the use-after-free report we looked at above. It tells us that a heap buffer overflow occurred, then goes on to report where the write happened and where the memory was originally allocated. Again, the rest of this report describes the layout of the heap, and probably isn't too important for your use case.

C++ Delete Mismatch

AddressSanitizer can be used on C++ codes as well. Consider the following (bad_delete.cxx):

#include <iostream>
#include <cstring>

int main(int argc, const char *argv[]) {
    char *cstr = new char[100];
    strcpy(cstr, "Hello World");
    std::cout << cstr << std::endl;

    delete cstr;
    return 0;
}

What's the problem here? The memory pointed to by "cstr" was allocated with new[]. An array allocation must be deleted with the delete[] operator, not "delete".

To build this code, just use g++ instead of gcc:

g++ bad_delete.cxx -o bad_delete -fsanitize=address -static-libasan -g

And running it, we get the following output:

Hello World
=================================================================
==257438==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new [] vs operator delete) on 0x60b000000040
    #0 0x4d0a78 in operator delete(void*, unsigned long) /apps_src/gnu/8.4.0/src/libsanitizer/asan/asan_new_delete.cc:151
    #1 0x509ea8 in main /users/PZS0710/edanish/test/asan/bad_delete.cxx:9
    #2 0x2b8232878544 in __libc_start_main (/lib64/libc.so.6+0x22544)
    #3 0x40642b  (/users/PZS0710/edanish/test/asan/bad_delete+0x40642b)

0x60b000000040 is located 0 bytes inside of 100-byte region [0x60b000000040,0x60b0000000a4)
allocated by thread T0 here:
    #0 0x4cf840 in operator new[](unsigned long) /apps_src/gnu/8.4.0/src/libsanitizer/asan/asan_new_delete.cc:93
    #1 0x509e5f in main /users/PZS0710/edanish/test/asan/bad_delete.cxx:5
    #2 0x2b8232878544 in __libc_start_main (/lib64/libc.so.6+0x22544)

SUMMARY: AddressSanitizer: alloc-dealloc-mismatch /apps_src/gnu/8.4.0/src/libsanitizer/asan/asan_new_delete.cc:151 in operator delete(void*, unsigned long)
==257438==HINT: if you don't care about these errors you may set ASAN_OPTIONS=alloc_dealloc_mismatch=0
==257438==ABORTING

This is similar to the other AddressSanitizer outputs we've looked at. This time, it tells us there's a mismatch between new and delete. It prints a stack trace for where the delete occurred (line 9) and also a stack trace for where the allocation occurred (line 5).

Performance

The documentation states:

This tool is very fast. The average slowdown of the instrumented program is ~2x

AddressSanitizer is much faster than tools that do similar analysis such as valgrind. This allows for usage on HPC codes.

However, if you find that AddressSanitizer is too slow for your code, there is a function attribute that disables it for specific functions. This way, you can use AddressSanitizer on the cooler (less frequently executed) parts of your code, while manually auditing the hot paths.

The compiler directive to skip analyzing functions is:

__attribute__((no_sanitize_address))

 


HOWTO: Use Cron and OSCusage for Regular Emailed Reports

It is possible to use Cron and the OSCusage command to send regular usage reports via email.

Cron

It is easy to create Cron jobs on the Owens and Pitzer clusters at OSC. Cron is a Linux utility which allows the user to schedule a command or script to run automatically at a specific date and time. A cron job is the task that is scheduled.

Shell scripts run as a cron job are usually used to update and modify files or databases; however, they can perform other tasks, for example a cron job can send an email notification.

Getting Help

Cron jobs are managed with the crontab command. Its usage and options are:

Usage: 
crontab [options] file 
crontab [options] 
crontab -n [hostname] 
Options: 
-u  define user 
-e edit user's crontab 
-l list user's crontab 
-r delete user's crontab 
-i prompt before deleting 
-n  set host in cluster to run users' crontabs 
-c get host in cluster to run users' crontabs 
-s selinux context 
-x  enable debugging 

Also, if this is your first time using cron, you will be asked to choose an editor for setting your cron job. Choose whatever you find to be easiest for you.

Running a Cron Job

To list the cron jobs currently scheduled for your account, use the -l option shown above,

crontab -l 

and to create and edit your cron job use the following command,

crontab -e 

Now, in order to write your first cron job, you need to be familiar with the formatting system that cron follows.

Linux Crontab Format

A crontab entry has 6 fields. Fields 1-5 define the date and time of the execution, and the 6th field is the command or script to be executed. The format is the following,

MIN HOUR DOM MON DOW CMD 

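where,

  • MIN is the minute (0-59)
  • HOUR is the hour (0-23)
  • DOM is the day of the month (1-31)
  • MON is the month (1-12)
  • DOW is the day of the week (0-6, with 0 meaning Sunday)
  • CMD is the command or script to run

Figure 1: Cron's formatting syntax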
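The six-field layout can be made concrete with a small Python sketch that splits one crontab entry into its schedule fields and its command. The function name split_crontab_line and the field labels are illustrative choices, not part of cron itself, and the sketch does not validate the values or handle special strings such as @daily.

```python
# Field labels follow the MIN HOUR DOM MON DOW CMD layout described above.
FIELD_NAMES = ("minute", "hour", "day_of_month", "month", "day_of_week")


def split_crontab_line(line):
    """Split one crontab entry into its five schedule fields and the command."""
    # Split on whitespace at most 5 times so the command keeps its own spaces.
    parts = line.strip().split(None, 5)
    if len(parts) < 6:
        raise ValueError("expected 5 schedule fields followed by a command")
    schedule = dict(zip(FIELD_NAMES, parts[:5]))
    return schedule, parts[5]


schedule, command = split_crontab_line("12 15 * * * /opt/osc/bin/OSCusage")
print(schedule)
print(command)
```

Because the split stops after the fifth field, a command that contains pipes or quoted arguments is returned unchanged as the sixth field.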

Getting Notified by Email Using a Cron Job

You can get an email notification using a cron job as mentioned earlier. The following is an example of a cron job that runs every minute and emails the output of {cmd} each time,

* * * * * {cmd} | mail -s "title of the email notification" {your email} 

A user can also set up email notifications regarding usage by using the OSCusage command,

12 15 * * * /opt/osc/bin/OSCusage | mail -s "OSC usage on $(date)" {your email} > /path/to/file/for/stdout/and/stderr 2>&1 

This cron job will run every day at 15:12 (3:12 PM).

Using OSCusage

The OSCusage command offers many options, as its help output shows,

$ /opt/osc/bin/OSCusage --help 
usage: OSCusage.py [-h] [-u USER] 
[-s {opt,pitzer,glenn,bale,oak,oakley,owens,ruby}] [-A] 
[-P PROJECT] [-q] [-H] [-r] [-n] [-v] 
[start_date] [end_date] 

positional arguments: 
start_date start date (default: 2020-04-23) 
end_date end date (default: 2020-04-24) 

optional arguments: 
-h, --help show this help message and exit 
-u USER, --user USER username to run as. Be sure to include -P or -A. (default: kalattar) 
-s {opt,pitzer,glenn,bale,oak,oakley,owens,ruby}, --system {opt,pitzer,glenn,bale,oak,oakle 
-A Show all 
-P PROJECT, --project PROJECT project to query (default: PZS0715) 
-q show user data 
-H show hours 
-r show raw 
-n show job ID 
-v do not summarize 

For example, one could use the -u option to report on another user's usage and schedule that command as a cron job with email notification.

Some other usage examples,

 OSCusage 2018-01-24 

where the argument specifies the start date of the reporting period. The end date can also be specified,

OSCusage 2018-01-24 2018-01-25 
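For a recurring report, the two dates usually need to be computed rather than typed. The following Python sketch builds the argument list for a one-day window ending on a given date. The function name oscusage_args is hypothetical, the OSCusage path is the one used in the cron example earlier, and whether the end date is treated as inclusive is not stated here, so check the output against a known period.

```python
from datetime import date, timedelta

# Path taken from the cron example earlier in this HOWTO.
OSCUSAGE = "/opt/osc/bin/OSCusage"


def oscusage_args(end, days=1):
    """Return the OSCusage argument list covering `days` days up to `end`."""
    start = end - timedelta(days=days)
    # OSCusage takes dates as YYYY-MM-DD, which is what isoformat() produces.
    return [OSCUSAGE, start.isoformat(), end.isoformat()]


print(" ".join(oscusage_args(date(2018, 1, 25))))
```

Passing date.today() as the end date gives the window for the most recent day; a list like this can be handed directly to subprocess.run.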

Terminating a Cron Job

To terminate a cron job, you need to first determine the process id,

ps aux | grep crontab 

and then use,

kill {PID}

A user can also remove the entry by editing the crontab with,

crontab -e 

HOWTO: Use Docker and Singularity Containers at OSC

It is now possible to run Docker and Singularity containers on the Owens and Pitzer clusters at OSC. Single-node jobs are currently supported, including GPU jobs; MPI jobs are planned for the future.

From the Docker website:  "A container image is a lightweight, stand-alone, executable package of a piece of software that includes everything needed to run it: code, runtime, system tools, system libraries, settings."

This document will describe how to run Docker and Singularity containers on Owens and Pitzer. You can use containers from Docker Hub, Sylabs Cloud, Singularity Hub, or any other source. As examples we will use hello-world from Singularity Hub and ubuntu from Docker Hub.

If you encounter any error, check out Known Issues on using Singularity at OSC. If the issue cannot be resolved, please contact OSC Help.

Contents

  1. Access a container
  2. Run a container
  3. File system access
  4. GPU usage within a container
  5. Build a container
  6. References

Getting help

The most up-to-date help on Singularity comes from the command itself.

singularity help

User guides and examples can be found at Sylabs.io.

Setting up your environment for Singularity usage

No setup is required. You can use Singularity directly on all clusters.

Accessing a container

A Singularity container is a single file with a .sif  extension.

* IMPORTANT NOTE: Prior to version 3.0, Singularity used .simg or .img as the file extension when you pulled a container from a hub.

You can simply download ("pull") a container from a hub. Popular hubs are Docker Hub and Singularity Hub. You can go there and search for a container that meets your needs. Docker Hub has more containers and may be more up to date, but it supports a much wider community than just HPC. Singularity Hub is for HPC, but it has fewer containers available. Additionally, there are domain and vendor repositories such as biocontainers and NVIDIA HPC containers that may have relevant containers.

Pull a container from hubs

Docker Hub

Pull from the 7.2.0 branch of the gcc repository on Docker Hub. The 7.2.0 is called a tag.

singularity pull docker://gcc:7.2.0

Filename:  gcc_7.2.0.sif

Pull an Ubuntu container from Docker Hub.

singularity pull docker://ubuntu:18.04

Filename:  ubuntu_18.04.sif

Singularity Hub

Pull the singularityhub/hello-world container from Singularity Hub. Since no tag is specified, it pulls from the master branch of the repository.

singularity pull shub://singularityhub/hello-world

Filename:  hello-world_latest.sif
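The file names above follow a visible pattern: the last component of the image name, an underscore, the tag (or "latest" when no tag is given), and the .sif extension. The Python sketch below reproduces that pattern so a script can predict the file it should look for after a pull. The function name default_sif_name is hypothetical, and the rule is inferred only from the three examples above; it is not taken from Singularity's own naming code, so treat it as an approximation.

```python
def default_sif_name(uri):
    """Guess the file name `singularity pull` produces for a hub URI."""
    # Drop the scheme, e.g. "docker://" or "shub://".
    _, _, reference = uri.partition("://")
    # Keep only the last path component, e.g. "hello-world".
    image = reference.rsplit("/", 1)[-1]
    # Split off the tag; fall back to "latest" when none is given.
    name, _, tag = image.partition(":")
    return f"{name}_{tag or 'latest'}.sif"


for uri in ("docker://gcc:7.2.0",
            "docker://ubuntu:18.04",
            "shub://singularityhub/hello-world"):
    print(uri, "->", default_sif_name(uri))
```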

Downloading containers from the hubs is not the only way to get one. You can, for example, get a copy from your colleague's computer or directory. If you would like to create your own container you can start from the user guide below. If you have any questions, please contact OSC Help.

Running a container

There are four ways to run a container under Singularity.

You can do this either in a batch job or on a login node.

Don’t run on a login node if the container will be performing heavy computation, of course.
If unsure about the amount of memory that a singularity process will require, then be sure to request an entire node for the job. It is common for singularity jobs to be killed by the OOM killer because of using too much RAM.

We note that the operating system on Owens is Red Hat:

[owens-login01]$ cat /etc/os-release
NAME="Red Hat Enterprise Linux Server"
VERSION="7.5 (Maipo)"
ID="rhel"
[..more..]

In the examples below we will often check the operating system to show that we are really inside a container.

Run container like a native command

If you simply run the container image it will execute the container’s runscript.

Example:  Run singularityhub/hello-world

Note that this container returns you to your native OS after you run it.

[owens-login01]$ ./hello-world_latest.sif
Tacotacotaco

Use the “run” sub-command

The Singularity “run” sub-command does the same thing as running a container directly as described above. That is, it executes the container’s runscript.

Example:  Run a container from a local file

[owens-login01]$ singularity run hello-world_latest.sif
Tacotacotaco

Example:  Run a container from a hub without explicitly downloading it

[owens-login01]$ singularity run shub://singularityhub/hello-world
INFO: Downloading shub image
Progress |===================================| 100.0%
Tacotacotaco

Use the “exec” sub-command

The Singularity “exec” sub-command lets you execute an arbitrary command within your container instead of just the runscript.

Example:  Find out what operating system the singularityhub/hello-world container uses

[owens-login01]$ singularity exec hello-world_latest.sif cat /etc/os-release
NAME="Ubuntu"
VERSION="14.04.5 LTS, Trusty Tahr"
ID=ubuntu
[..more..] 

Use the “shell” sub-command

The Singularity “shell” sub-command invokes an interactive shell within a container.

Example:  Run an Ubuntu shell. Note the “Singularity” prompt within the shell.

[owens-login01 ~]$ singularity shell ubuntu_18.04.sif
Singularity ubuntu_18.04.sif:~> cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04 LTS (Bionic Beaver)"
ID=ubuntu
[.. more ..] 
Singularity ubuntu_18.04.sif:~> exit
exit

File system access

When you use a container you run within the container’s environment.  The directories available to you by default from the host environment are

  • your home directory
  • working directory (directory you were in when you ran the container)
  • /fs/project
  • /fs/scratch
  • /tmp

You can review our Available File Systems page for more details about our file system access policy.

If you run the container within a job, you will have the usual access to the $PFSDIR environment variable, provided you add the node attribute "pfsdir" to the job request (nodes=XX:ppn=XX:pfsdir). You can access most of our file systems from a container without any special treatment.

GPU usage within a container

If you have a GPU-enabled container you can easily run it on Owens or Pitzer just by adding the --nv flag to the singularity exec or run command.  The example below comes from the "exec" command section of Singularity User Guide.  It runs a TensorFlow example using a GPU on Owens.  (Output has been omitted from the example for brevity.)

[owens-login01]$ sinteractive -n 28 -g 1
...
[o0756]$ git clone https://github.com/tensorflow/models.git
[o0756]$ singularity exec --nv docker://tensorflow/tensorflow:latest-gpu \
python ./models/tutorials/image/mnist/convolutional.py

In some cases it may be necessary to bind the CUDA_HOME path and add $CUDA_HOME/lib64 to the shared library search path:

[owens-login01]$ sinteractive -n 28 -g 1
...
[o0756]$ module load cuda
[o0756]$ export SINGULARITY_BINDPATH=$CUDA_HOME
[o0756]$ export SINGULARITYENV_LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$CUDA_HOME/lib64
[o0756]$ singularity exec --nv my_container mycmd
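When the same container is launched from a Python driver script, the command line and the two environment variables above can be assembled programmatically. The sketch below only builds the argument list and environment; it does not run anything. The function name singularity_exec and the example path /opt/cuda are hypothetical, while the --nv flag and the SINGULARITY_BINDPATH and SINGULARITYENV_LD_LIBRARY_PATH variables are the ones shown above.

```python
import os


def singularity_exec(image, command, use_gpu=False, cuda_home=None):
    """Build the argument list and environment for `singularity exec`."""
    args = ["singularity", "exec"]
    if use_gpu:
        args.append("--nv")  # same flag as in the shell examples above
    args.append(image)
    args.extend(command)

    env = dict(os.environ)
    if cuda_home:
        # Mirror the two exports shown above.
        env["SINGULARITY_BINDPATH"] = cuda_home
        host_paths = env.get("LD_LIBRARY_PATH", "")
        env["SINGULARITYENV_LD_LIBRARY_PATH"] = f"{host_paths}:{cuda_home}/lib64"
    return args, env


# "/opt/cuda" is a placeholder; on a real node use the value of $CUDA_HOME.
args, env = singularity_exec("my_container", ["mycmd"],
                             use_gpu=True, cuda_home="/opt/cuda")
print(args)
```

The returned pair can be passed to subprocess.run(args, env=env, check=True) inside a job that has a GPU allocated.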

Build a container

If you want to create or modify a container, you need root-like privilege on any OSC system. Alternatively, you can create a container on a hub or a local computer and then pull or upload it to an OSC system.

References

 

 

HOWTO: Use Extensions with JupyterLab

JupyterLab stores its main build and associated data, including extensions, in the Application Directory. The default Application Directory is the JupyterLab installation directory, which is read-only for OSC users. Unlike Jupyter Notebook, JupyterLab cannot accommodate multiple paths for extension management. Therefore, we set the Application Directory to a location in the user's home directory so that users can manage extensions.

NOTE: The extension management is only available for JupyterLab 2 or later.

Manage and install extensions

After launching a JupyterLab session, open a notebook and run

!jupyter lab path

Check whether the Application Directory is set to a location in your home directory:

Application directory:   /users/PXX1234/user/.jupyter/lab/3.0
User Settings directory: /users/PXX1234/user/.jupyter/lab/user-settings
Workspaces directory: /users/PXX1234/user/ondemand/data/sys/dashboard/batch_connect/dev/bc_osc_jupyter/output/f2a4f918-b18c-4d2a-88bc-4f4e1bdfe03e

If the home directory is NOT set, try removing the corresponding directory. For example, if you are using JupyterLab 2.2, remove the entire directory $HOME/.jupyter/lab/2.2 and re-launch JupyterLab 2.2.
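The check can also be done in a notebook cell instead of by eye. The Python sketch below parses text in the format printed by jupyter lab path and reports whether the Application directory sits inside a given home directory. The function names are hypothetical, and the sample text reuses the placeholder paths from the output above.

```python
from pathlib import PurePosixPath


def application_dir(path_output):
    """Extract the Application directory from `jupyter lab path` output."""
    for line in path_output.splitlines():
        label, _, value = line.partition(":")
        if label.strip().lower() == "application directory":
            return PurePosixPath(value.strip())
    return None


def is_under_home(path_output, home):
    """Return True when the Application directory lives inside `home`."""
    app_dir = application_dir(path_output)
    return app_dir is not None and PurePosixPath(home) in app_dir.parents


sample = """Application directory:   /users/PXX1234/user/.jupyter/lab/3.0
User Settings directory: /users/PXX1234/user/.jupyter/lab/user-settings"""
print(is_under_home(sample, "/users/PXX1234/user"))
```

In a live session, the sample text could be replaced by the captured output of the command and the home argument by the value of $HOME.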

If this is your first time using extensions, or if you are using extensions that were installed with a different Jupyter version or on a different cluster, you will need to run

!jupyter lab build

to initialize the JupyterLab application.

To manage and install extensions, simply click the Extension Manager icon in the sidebar:

Screen Shot 2021-07-27 at 1.30.45 PM.png

Please note that the OSC Jupyter app is a portal to launch JupyterLab installed at OSC. It does not behave the same as a standalone Jupyter installation on your computer. Some extensions that work on your computer might not work with OSC Jupyter. If you experience any issue, please contact OSC Help.

 

 

HOWTO: Use Globus (Overview)

 


Globus is a cloud-based service designed to let users move, share, and discover research data via a single interface, regardless of its location or number of files or size.

Globus was developed and is maintained at the University of Chicago and is used extensively at supercomputer centers and major research facilities.

Globus is available as a free service that any user can access. More on how Globus works can be found on the Globus "How It Works" page.

OSC deprecated its Globus v4 endpoint on Tuesday, Jan 26, 2021. Please use the Globus v5.4 endpoint by searching for 'OSC Globus Connect Server' when transferring and sharing data. See this page for more information on how to transfer data and this page for more information on how to share data.

Further Reading

HOWTO: Install Globus Connect Personal

Globus is a reliable, high-performance file transfer platform allowing users to transfer large amounts of data seamlessly between systems. It aims to make transfers a "click-and-forget" process by setting up configuration details in the background and automating fault recovery.  

Globus can be used for file transfers between OSC and either:

  • A computing institution with Globus installed (check with your site provider for availability) or
  • A personal computer (known as a personal endpoint)

Users transferring between OSC and another computing institution with​ Globus installed do not need to install Globus Connect Personal, and can skip this page.

Install Globus Connect Personal

To use Globus to transfer from a personal computer, you will need to install the Globus Connect Personal client on your computer following the steps below. Those transferring between OSC and another computing institution can skip to Usage.

 


 

  1. Log in to Globus. If your institution does not have an organizational login, you may choose to either Sign in with Google or Sign in with ORCiD iD.
  2. Download the Globus Connect Personal Client
    • In the main Globus web page, click "ENDPOINTS" on the sidebar.
    • Click "Create a personal endpoint" on the top-right of the page.
    • Download the Globus Connect client for your operating system.
  3. Install Globus Connect Personal Client
    • Windows
      1. Run the Installer
      2. Copy-Paste the setup key to complete the installation
    • Mac​​
      1. Mount your drives
      2. Copy the Globus Client to your Application Folder
      3. Start The Globus Client, and enter the provided setup key
    • Linux
      1. Un-tar the .tgz file with the command tar -zxvf
      2. Run globusconnect, found within the extracted directory
      3. Copy-Paste the setup key when prompted
  4. Set up Globus Personal
    • The setup process will ask you to log in. Click Login.
    • This will bring up a browser window in which Globus asks for the permissions that Globus Personal needs. Click "allow" to continue.
    • This will bring up a Globus Personal window to finish the setup process. Fill out the required information and click "Save".
  5. (Optional) Changing directories accessible to Globus

By default, Globus will only add certain default folders to the list of files and directories accessible by Globus. To change/add/remove files and directories from this list:

Windows

  1. Start Globus Connect Personal
  2. Go to "Options"
  • Add directories/files using the  "+" button
  • Remove directories/files using the "-" button
  • Revert to the default accessible directories/files using the "Reset to Defaults" button
  • Any changes you make are not made permanent until you press the "Save" button

​​Mac

  1. Start Globus Connect Personal
  2. Go to Preferences -> Access
  • Add directories/files using the  "+" button
  • Remove directories/files using the "-" button
  • Revert to the default accessible directories/files using the "Reset to Defaults" button
  • Any changes you make are not made permanent until you press the "Save" button

​Linux

  1. Configure the   ~/.globusonline/lta/config-paths  file. This file is a plain text file, with each line corresponding to the configuration of a particular directory path you wish to make accessible. Each line consists of 3 comma-separated fields as below
    <path1>,<sharing flag>,<R/W flag>
    <path2>,<sharing flag>,<R/W flag>
    <path3>,<sharing flag>,<R/W flag>
    ...
    

    Path: an absolute directory/path to be permitted. A leading tilde "~" can be used to represent the home directory of the user that runs globusconnectpersonal.

    Sharing Flag: it controls sharing, with a value of "1" allowing sharing for the path and a value of "0" disallowing sharing for the path.

    R/W Flag: it determines whether the path will be accessible read-only or read/write, with a "1" permitting read/write access and a "0" marking the path as read-only. 

  2. After making a change to the ~/.globusonline/lta/config-paths file, you must stop and restart Globus Connect Personal before the changes will take effect as below
$ ./globusconnectpersonal -stop
$ ./globusconnectpersonal -start &
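If the list of accessible paths is maintained by a script, each config-paths line can be generated from the three fields described above. The Python sketch below formats one line per path. The function name config_path_line and the example directory ~/shared_results are hypothetical, and because the text above does not describe any escaping for commas, the sketch simply rejects paths that contain one.

```python
def config_path_line(path, sharing=False, writable=False):
    """Format one line of ~/.globusonline/lta/config-paths."""
    if "," in path:
        raise ValueError("a comma in the path would break the 3-field format")
    # Fields: <path>,<sharing flag>,<R/W flag>, where 1 enables and 0 disables.
    return f"{path},{int(sharing)},{int(writable)}"


lines = [
    config_path_line("~/", sharing=False, writable=True),
    config_path_line("~/shared_results", sharing=True, writable=False),
]
print("\n".join(lines))
```

Here the home directory is read/write but not shareable, while the second directory is shareable but read-only. After writing lines like these to the file, Globus Connect Personal still has to be stopped and restarted as shown above.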

HOWTO: Transfer files using Globus

Globus is a reliable, high-performance file transfer platform allowing users to transfer large amounts of data seamlessly between systems. It aims to make transfers a "click-and-forget" process by setting up configuration details in the background and automating fault recovery.  

Globus can be used for file transfers between OSC and either:

  • A computing institution with Globus installed (check with your site provider for availability) or
  • A personal computer (known as a personal endpoint). See this page on how to install Globus Connect Personal

Users transferring between OSC and another computing institution with​ Globus installed do not need to install Globus Connect Personal.

If you are looking to transfer smaller files, you can use OnDemand's file transfer capabilities or an SFTP client connected to sftp.osc.edu. Our general recommendation is to use OnDemand or SFTP for small files, measured in MB to several hundred MB. You can continue to use SFTP and get reasonable performance up to file sizes of several GB. For transfers of several GB or larger, you should consider using Globus.

Transfer files using Globus

We provide instructions on how to transfer data on this page. If you would like to share your data with your collaborators, please see this page.

If you are doing a large transfer you should transfer to/from the parallel file system for best performance. 


 

  1. Log in to Globus. If your institution does not have an organizational login, you may choose to either Sign in with Google or Sign in with ORCiD iD.
  2. If you do not log in with your OSC HPC credentials, map your OSC credentials to your Globus account following this page. Select 'Ohio Supercomputer Center (OSC)' as the identity provider, then provide your OSC HPC credentials.

  3. Navigate to the "File Manager" on the sidebar. If the view is one panel, select the two-panel view by switching the icon beside "Panels" on the top-right of the page.
  4. Enter your endpoint in one of the collections
    • If transferring to a computer with Globus Connect Personal installed, make sure to have Globus Connect Personal running. The name will be the unique name chosen during installation. It is also listed under "Administered by You" under the "ENDPOINTS" on the sidebar
  5. Enter 'OSC Globus Connect Server' in the other collection box to search all the collections managed by OSC, listed below. Choose the collection where you want to do the data transfer:
    • Storage                   Endpoint
      OSC's home directory      OSC $HOME
      OSC's project directory   OSC /fs/project
      OSC's scratch directory   OSC /fs/scratch
      OSC's ess storage         OSC /fs/ess
      AWS S3 storage            OSC S3
    • If needed, enter your username and password for the authentication
    • You should see the files in the collection. You can also change the directory by entering the appropriate path in the Path box
  6. You can now transfer files and directories both ways by selecting them and pressing the arrow indicating which way you'd like to transfer
  7. Once you click 'Start', an activity report will appear, and you can click on it to see the details
  8. Click 'View details' to get the activity report about the transfer. You will also receive an email once the transfer is completed (successful or failed)
Once a transfer has begun, you do not need to keep the Globus webpage up, but you will need to make sure the Globus Connect Personal Client is running on your computer until it has completed.  If the transfer is interrupted for any reason, Globus will attempt to re-initiate the transfer automatically.

HOWTO: Share data using Globus

Globus Connect Server allows OSC users to share data with collaborators who do not have an OSC HPC account (the collaborator does need to sign up for a free Globus account). The advantage of data sharing via Globus is that you do not have to move your data in order to share it. You can select directory paths to be securely shared with your collaborator, and grant them read-only or read-write access.

Note: You can only share directories, not individual files. 

Sharing data with collaborators 


 

  • Log in to Globus. If your institution does not have an organizational login, you may choose to either Sign in with Google or Sign in with ORCiD iD
  • Map your OSC credentials to your Globus account following this page if you haven't done this before. Select 'Ohio Supercomputer Center (OSC)' as the identity provider from the list, then provide your OSC HPC credentials
  • Navigate to "ENDPOINTS" on the sidebar. Enter 'OSC Globus Connect Server' in the search box to search all the endpoints managed by OSC, listed below:
      Storage                   Endpoint
      OSC's home directory      OSC $HOME
      OSC's project directory   OSC /fs/project
      OSC's scratch directory   OSC /fs/scratch
      OSC's ess storage         OSC /fs/ess
      AWS S3 storage            OSC S3
  • Choose the endpoint where you want to share the data. You will be redirected to the overview page of this endpoint. Click the 'Collections' menu. You will see all of the collections previously added under this endpoint. To add a new collection, click 'Add a Collection'.
  • Create a guest collection. (Note: the default path will be $HOME for home directory, /fs/project for project filesystem, /fs/scratch for scratch filesystem, and /fs/ess for ess storage. You can change to a more specific directory by providing the path in 'Directory'). Click 'Create Collection' to finish the creation.
  • Navigate to the "File Manager" on the sidebar. Select the one-panel view by switching the icon beside "Panels" on the top-right of the page
  • Enter the collection you want to share by clicking 'select a collection'
  • Click 'Your Collections' to choose the right collection
  • Click the 'Share' button in the right side pane
  • To share with your collaborators, click 'Add Permissions - Share With'. You can specify the directory you want to share, choose to share with users by selecting their email address or Globus username, or share with a group or all Globus users or everyone. You can also select 'Send Email' to send email to your collaborator. If you want your collaborator to also be able to transfer data into the directory, you can also provide write access by checking the 'write' button. Once the options have been selected, click 'Add Permission'. You should be redirected to the information page on this collection, and see the people you have shared it with.
    • You can repeat this process for any number of collaborators.
  • To terminate the sharing, you can click the 'trash can' icon next to the invitee on the screen.
  • To see all collections you have shared, go to 'Endpoints' in the left bar, then 'Shareable by You'.
If your collaborator has write access, they can transfer data into that directory and delete files within it and any of its subdirectories, so be careful about providing write access.
It is highly recommended that you terminate the sharing or delete the collection when your collaborator has completed downloading the data. 

Information for your collaborators 

  1. Your collaborator needs to sign up for a free Globus account for data sharing
  2. Your collaborator needs to install the Globus Connect Personal client on their personal computer if they need to download the data. See this page for more information.
  3. If you select 'Send Email' to send an email to your collaborator, they will get an email including the URL to access the share. Your collaborator can click on the link and log in with their credentials
  4. Your collaborator can also access the share by logging into Globus, navigating to "ENDPOINTS" on the sidebar, and clicking on 'Shared with You'.

Further Reading

 

    HOWTO: Use AWS S3 in Globus

    Globus Connect Server v5 allows OSC clients to connect to Amazon S3. Please follow the steps below:

    Create a New Bucket

    • Log into Amazon AWS
    • Navigate to Services -> S3
    • Click 'Create Bucket'
    • Give the bucket a name then click 'Next'
    • Click 'Next' and then 'Next' again then 'Create bucket' to create a new bucket

    Create a New IAM

    • Log into Amazon AWS
    • Navigate to Services -> IAM

    Add Policy

    • Select Policies and then click 'Create Policy'
    • Choose the JSON tab and paste the JSON policy as below (It will be necessary to replace the value for 'Resource' to match bucket name from previous steps):
    You may also refer to this page provided by Globus for the JSON policy.
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllBuckets",
                "Effect": "Allow",
                "Action": [
                    "s3:ListAllMyBuckets",
                    "s3:GetBucketLocation"
                ],
                "Resource": "*"
            },
            {
                "Sid": "Bucket",
                "Effect": "Allow",
                "Action": [
                    "s3:ListBucket"
                ],
                "Resource": "arn:aws:s3:::osc-globus-test"
            },
            {
                "Sid": "Objects",
                "Effect": "Allow",
                "Action": [
                    "s3:DeleteObject",
                    "s3:GetObject",
                    "s3:PutObject"
                ],
                "Resource": "arn:aws:s3:::osc-globus-test/*"
            }
        ]
    }
    • Choose 'Review Policy'
    • Fill in the name such as 'AllowGlobusS3_osc-globus-test', click 'Create Policy' to create the new policy
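Since the only parts of the policy that change between buckets are the two 'Resource' values, the JSON can be generated rather than edited by hand. The Python sketch below builds the same three statements for an arbitrary bucket name. The function name globus_s3_policy and the bucket name my-research-bucket are hypothetical, and the output should still be compared against the policy page provided by Globus before use.

```python
import json


def globus_s3_policy(bucket):
    """Build the IAM policy shown above for the given bucket name."""
    bucket_arn = f"arn:aws:s3:::{bucket}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Sid": "AllBuckets", "Effect": "Allow",
             "Action": ["s3:ListAllMyBuckets", "s3:GetBucketLocation"],
             "Resource": "*"},
            # Listing applies to the bucket itself.
            {"Sid": "Bucket", "Effect": "Allow",
             "Action": ["s3:ListBucket"],
             "Resource": bucket_arn},
            # Object actions apply to everything inside the bucket.
            {"Sid": "Objects", "Effect": "Allow",
             "Action": ["s3:DeleteObject", "s3:GetObject", "s3:PutObject"],
             "Resource": f"{bucket_arn}/*"},
        ],
    }


print(json.dumps(globus_s3_policy("my-research-bucket"), indent=4))
```

The printed JSON can be pasted into the JSON tab in place of the hand-edited version.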

    Add Group

    • Choose 'Groups' in left menu bar then click 'Create New Group'. Provide the group name such as “s3_osc-globus-test” then click 'Next Step' button
    • In search bar for Filter type 'Globus' and check box next to 'AllowGlobusS3_osc-globus-test' then click 'Next Step'
    • Click 'Create Group' after reviewing new group information. 

    Add User

    • Choose 'Users' in left menu bar and then click 'Add user'. Provide the username, check 'Programmatic access' and then click 'Next: Permissions'
    • Choose 'Add user to group', filter for the group created previously and check the box next to 's3_osc-globus-test', click 'Next: Tags' and then 'Next: Review'
    • Click 'Create user'
    • Make a copy of 'Access key ID' and click 'Show' for 'Secret access key' and save both for future reference
    • Click 'Close'

    Create a New Collection

    • Log in to Globus. If your institution does not have an organizational login, you may choose to either Sign in with Google or Sign in with ORCiD iD
    • Navigate to the 'ENDPOINTS' on the sidebar and search 'OSC, s3'. Click 'OSC S3' to go to this gateway
    • Click the 'Collections' menu. You will see all of the collections added by you before. To add a new collection, click 'Add a Collection'. Enter your OSC username and password for the authentication
    • Provide 'Access key' and "Access secret' from previous steps and associate Globus identity with your OSC identity
    • Provide bucket name for 'Base Directory'
    • Provide the name of the collection in 'Collection Display Name' field
    • Click 'Create Collection' to finish the creation
    • Click the logo of 'Globus' and navigate back to the 'ENDPOINTS' on the sidebar. Click the 'Administered by You' and then you can locate the new collection you just created. 

    Further Reading

     

    HOWTO: Add InCommon Authentication to Globus

    (OPTIONAL) Adding InCommon Authentication 

    Adding InCommon authentication to your Globus account allows you to log in to Globus Online using your university credentials.  Using this process, you can store your Globus username and password for safekeeping, and instead use your university username and password to log in.  If you are already logged in to your university authentication system, logging in to Globus can be as simple as two clicks.

    To use this feature, your university needs to be an InCommon participant.  Some Ohio universities active in InCommon include: Ohio State University, Case Western Reserve University, Columbus State Community College, Miami University, Ohio Northern University, Ohio University, University of Findlay, University of Dayton, and many more.  

    For a complete list, visit https://incommon.org/participants/ .

    To add InCommon Authentication:
    1. Log in to Globus Online
    2. Go to "Manage Identities" under your username
    3. Click "Add External Identity"
      • Choose a name for the Identity Settings
      • Choose InCommon / CILogon from the drop-down menu
    4. On the next page, choose your University / Identity Provider
      • Click "Remember this selection"
      • Click "Log on"
      • You may be prompted to log in to your university authentication system if you are not already logged in

    The next time you log in, click "alternative login" and then "InCommon / CILogon".  Select your university on the next page, and log in using your university credentials.  Globus will remember this preference, and automatically prompt you to log in using your university authentication next time.

    HOWTO: Deploy your own endpoint on a server

    OSC clients who are affiliated with Ohio State can deploy their own endpoint on a server using OSU subscriptions. Please follow the steps below:

    1. Send a request to OSC Help with the following information:
      • Name of the organization that will be running the endpoint, e.g., OSU Arts and Sciences
        • NOTE: if the name already exists, the requester will have to coordinate with the existing Admin for that project
      • OSU-affiliated email address associated with the Globus account, e.g., name.#@osu.edu
    2. OSC will create a new project at https://developers.globus.org, make the user provided in #1 the administrator, and inform the user to set up the endpoint credentials
    3. The user goes to https://developers.globus.org/ and chooses “Register a new Globus Connect Server v5”. Under the project, the user opens the Add dropdown and chooses Add new Globus Connect Server, then provides a display name for the endpoint, e.g., datamover02.hpc.osc.edu. The user selects “Generate New Client Secret”, saves that value and the Client ID, and uses those values when configuring the Globus Connect Server install on their local system
    4. The user finishes configuring Globus Connect Server and runs the necessary commands to register the new endpoint with Globus. Once the new endpoint is registered, please email OSC Help the endpoint name so we can mark the endpoint as managed under the OSU subscription

     


    HOWTO: Use VNC in a batch job

    SSHing directly to a compute node at OSC - even if that node has been assigned to you in a current batch job - and starting VNC is an "unsafe" thing to do. When your batch job ends (and the node is assigned to other users), stray processes will be left behind and negatively impact other users. However, it is possible to use VNC on compute nodes safely.

    You can use OnDemand, which is a much easier way to access desktops. If your work is not a very large, very intensive computation (for example, you do not expect to saturate all of the cores on a machine for a significant portion of the time you have the application open, such as when you are using the GUI to set up a problem for a longer non-interactive compute job), you can choose a VDI under "Virtual Desktop Interface" from the "Desktops" menu. Otherwise, please use "Interactive HPC" from the "Desktops" menu.

    The examples below are for Pitzer. If you use other systems, please see this page for supported versions of TurboVNC on our systems.

    Starting your VNC server

    Step one is to create your VNC server inside a batch job.

    Option 1: Interactive

    The preferred method is to start an interactive job, requesting a GPU node, and then once your job starts, you can start the VNC server.

    salloc --nodes=1 --ntasks-per-node=40 --gpus-per-node=1 --gres=vis --constraint=40core srun --pty /bin/bash
    

    This command requests an entire GPU node, and tells the batch system you wish to use the GPUs for visualization. This will ensure that the X11 server can access the GPU for acceleration. In this example, I have not specified a duration, which will then default to 1 hour.

    Once the job starts, load the VirtualGL and TurboVNC modules:

    module load virtualgl
    module load turbovnc
    

    Then start your VNC server. (The first time you run this command, it may ask you for a password - this is to secure your VNC session from unauthorized connections. Set it to whatever password you desire. We recommend a strong password.)

    vncserver
    
    To set the VNC password again, use the vncpasswd command.

    The output of the vncserver command is important: it tells you where to point your client to access your desktop. Specifically, we need both the host name (before the :), and the screen (after the :).

    New 'X' desktop is p0302.ten.osc.edu:1
    

    Connecting to your VNC server

    Because the compute nodes of our clusters are not directly accessible, you must log in to one of the login nodes and allow your VNC client to "tunnel" through SSH to the compute node. The specific method of doing so may vary depending on your client software.

    The port assigned to the vncserver will be needed. It is usually 5900 + <display_number>. e.g.

    New 'X' desktop is p0302.ten.osc.edu:1

    would use port 5901.
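    The port arithmetic above can be sketched in a few lines of shell. This is an illustrative helper only; the function name vnc_port is an assumption, not part of TurboVNC.

```shell
#!/bin/bash
# Derive the VNC TCP port from the "host:display" string printed by vncserver.
# vnc_port is a hypothetical helper name.
vnc_port() (
    desktop="$1"               # e.g. p0302.ten.osc.edu:1
    display="${desktop##*:}"   # text after the last colon
    echo $((5900 + ${display}))
)

vnc_port "p0302.ten.osc.edu:1"   # prints 5901
```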

    Linux/MacOS

    Option 1: Manually create an SSH tunnel 

    I will be providing the basic command line syntax, which works on Linux and MacOS. You would issue this in a new terminal window on your local machine, creating a new connection to Pitzer.

    ssh -L <port>:<node_hostname>.ten.osc.edu:<port> <username>@pitzer.osc.edu
    

    The above command establishes a proper ssh connection for the vnc client to use for tunneling to the node.

    Open your VNC client, and connect to localhost:<screen_number>, which will tunnel to the correct node on Pitzer.
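    As a sketch of how the pieces fit together, the helper below builds the tunnel command from the desktop string reported by vncserver. It only prints the command rather than running it. The function name vnc_tunnel_cmd is hypothetical, and the login host pitzer.osc.edu is taken from the example above.

```shell
#!/bin/bash
# Print (not run) the ssh tunnel command for a desktop string from vncserver.
# vnc_tunnel_cmd is a hypothetical helper name.
vnc_tunnel_cmd() (
    desktop="$1"                 # e.g. p0302.ten.osc.edu:1
    user="$2"                    # your OSC username
    node="${desktop%%:*}"        # host name before the colon
    display="${desktop##*:}"     # display number after the colon
    port=$((5900 + ${display}))
    echo "ssh -L ${port}:${node}:${port} ${user}@pitzer.osc.edu"
)

vnc_tunnel_cmd "p0302.ten.osc.edu:1" "<username>"
# prints: ssh -L 5901:p0302.ten.osc.edu:5901 <username>@pitzer.osc.edu
```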

    Option 2: Use your VNC software to tunnel 

    This example uses Chicken of the VNC, a MacOS VNC client. It assumes a vncserver started on host n0302 with port 5901 and display 1.

    The default window that comes up for Chicken requires the host to connect to, the screen (or port) number, and optionally allows you to specify a host to tunnel through via SSH. This screenshot shows a proper configuration for the output of vncserver shown above. Substitute your host, screen, and username as appropriate.

    When you click [Connect], you will be prompted for your HPC password (to establish the tunnel, provided you did not input it into the "password" box on this dialog), and then (if you set one), for your VNC password. If your passwords are correct, the desktop will display in your client.

    Windows

    This example shows how to create an SSH tunnel through your SSH client.  We will be using PuTTY in this example, but these steps are applicable to most SSH clients.

    First, make sure you have X11 forwarding enabled in your SSH client.

    Next, open up the port forwarding/tunnels settings and enter the hostname and port you got earlier in the destination field.  You will need to add 5900 to the port number when specifying it here.  Some clients may have separate boxes for the destination hostname and port.  

    For source port, pick a number between 11 and 99 and add 5900 to it.  This number between 11 and 99 will be the port you connect to in your VNC client.

    Make sure to add the forwarded port, and save the changes you've made before exiting the configuration window.

    PuTTY Tunnel Configuration Settings

    Now start an SSH session to the respective cluster your vncserver is running on.  The port forwarding will automatically happen in the background.  Closing this SSH session will close the forwarded port; leave the session open as long as you want to use VNC.

    Now start a VNC client.  TurboVNC has been tested with our systems and is recommended.  Enter localhost:[port], replacing [port] with the number between 11 and 99 you chose earlier.

    New TurboVNC Connection

    If you've set up a VNC password you will be prompted for it now.  A desktop display should pop up now if everything is configured correctly.

    How to Kill a VNC session?

    Occasionally you may make a mistake and start a VNC server on a login node or somewhere else you did not want to.  In this case it is important to know how to properly kill your VNC server so no processes are left behind.

    The command syntax to kill a VNC session is:

    vncserver -kill :[screen]
    

    In the example above, screen would be 1.

    You need to make sure you are on the same node you spawned the VNC server on when running this command.
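    One way to guard against running the kill command on the wrong node is to compare host names first. The following is a minimal sketch under stated assumptions: the function name kill_vnc_here and the VNCSERVER override variable are hypothetical, and it assumes hostname -f on the node returns the same name that vncserver reported.

```shell
#!/bin/bash
# Kill a VNC session only when run on the node that hosts it.
# kill_vnc_here and the VNCSERVER variable are hypothetical names for this sketch.
kill_vnc_here() (
    desktop="$1"                        # e.g. p0302.ten.osc.edu:1
    node="${desktop%%:*}"
    display="${desktop##*:}"
    this_host="${2:-$(hostname -f)}"    # optional second argument overrides the detected host
    if [ "$node" != "$this_host" ]; then
        echo "Session runs on ${node}, but this is ${this_host}; log in there first." >&2
        exit 1
    fi
    # VNCSERVER can be set to "echo" for a dry run; it defaults to the real command.
    ${VNCSERVER:-vncserver} -kill ":${display}"
)
```

    For example, kill_vnc_here p0302.ten.osc.edu:1 would run vncserver -kill :1 only when executed on p0302.ten.osc.edu.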


    HOWTO: Use a Conda/Virtual Environment With Jupyter

    The IPython kernel for a Conda/virtual environment* must be installed on Jupyter prior to use.

    *See create conda/virtual environment if there is not already an environment that has been created.

    Install kernel

    To perform the kernel installation, users should load the preferred version of Python*

    module load python
    
    * Change to the proper version of python (or the version that you created your conda/venv with)
    * Use module spider python to view available python versions

    and run one of the following commands, depending on how your Conda/virtual environment was created. Make sure to replace MYENV with the name of your conda environment or path to the environment.

    If the Conda environment was created via conda create -n MYENV

    ~support/classroom/tools/create_jupyter_kernel conda MYENV

    If the Conda environment was created via conda create -p /path/to/MYENV

    ~support/classroom/tools/create_jupyter_kernel conda /path/to/MYENV

    If the Python virtual environment was created via python3 -m venv /path/to/MYENV

    ~support/classroom/tools/create_jupyter_kernel venv /path/to/MYENV
    

    Install Jupyterlab Debugger kernel

    According to the JupyterLab documentation, the debugger requires ipykernel >= 6. Please create your own kernel with conda using the following commands:

    $ module load miniconda
    $ conda create -n jupyterlab-debugger -c conda-forge "ipykernel>=6" xeus-python
    $ ~support/classroom/tools/create_jupyter_kernel conda jupyterlab-debugger

    You should see a kernelspec 'conda_jupyterlab-debugger' created in your home directory. Once the debugger kernel is installed, you can use it:
    1. Go to OnDemand
    2. Request a JupyterLab app with kernel 3
    3. Open a notebook with the debugger kernel
    4. Enable debug mode in the upper-right corner of the notebook

    Remove kernel

    If the environment is rebuilt or renamed, users may want to erase any custom Jupyter kernel installations.

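    Be careful! This command will erase entire directories and all files within them.
    # ${MYENV:?} stops the command if MYENV is unset or empty, so the whole kernels directory is never removed by mistake
    rm -rf ~/.local/share/jupyter/kernels/"${MYENV:?}"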
    

    Manually install kernel

    If the create_jupyter_kernel script does not work for you, try the following steps to manually install kernel:

    # change to the proper version of python
    module load python  
    
    # replace with the name of conda env           
    MYENV='useful-project-name'    
    
    # Activate your conda/virtual environment
    ## For Conda environment
    source activate "$MYENV"
    
    ## For Python Virtual environment
    source /path/to/$MYENV/bin/activate
    
    # Install Jupyter kernel 
    python -m ipykernel install --user --name "$MYENV" --display-name "Python ($MYENV)"
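    After a manual install, it can help to confirm that the kernel was registered. The sketch below checks for a kernel.json file under the user-level kernels directory used elsewhere in this guide. The function name kernel_installed is hypothetical, and the check assumes the kernel directory is named exactly as the value passed to --name.

```shell
#!/bin/bash
# Succeed if a user-level Jupyter kernel directory contains a kernel.json file.
# kernel_installed is a hypothetical helper name.
kernel_installed() (
    name="$1"
    kernels_dir="${2:-$HOME/.local/share/jupyter/kernels}"   # optional override of the search directory
    [ -f "${kernels_dir}/${name}/kernel.json" ]
)

if kernel_installed "useful-project-name"; then
    echo "kernel is registered"
else
    echo "kernel not found"
fi
```

    The command jupyter kernelspec list gives the same information for all installed kernels.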

     


    HOWTO: Use an Externally Hosted License

    Many software packages require a license.  These licenses are usually made available via a license server, which allows software to check out necessary licenses.  In this document, "external" refers to a license server that is not hosted inside OSC.

    If you have such a software license server set up using a license manager, such as FlexNet, this guide will instruct you on the necessary steps to connect to and use the licenses at OSC.

    Users who wish to host their software licenses inside OSC should consult OSC Help.

     

    You are responsible for ensuring you are following your software license terms.  Please ensure your terms allow you to use the license at OSC before beginning this process!

    Introduction

    Broadly speaking, there are two different ways in which the external license server's network may be configured.  These differ by whether the license server is directly externally reachable or if it sits behind a private internal network with a port forwarding firewall.  

    If your license server sits behind a private internal network with a port forwarding firewall you will need to take additional steps to allow the connection from our systems to the license server to be properly routed. 

    License Server is Directly Externally Reachable

    Figure depicting a License Server with firewall connected to the internet, and an outbound compute node whose traffic is routed through NAT to the internet

    License Server is Behind Port Forwarding Firewall

    Figure depicting a License Server with a Full Port Forwarding Firewall inside a Private Internal Network connected to the internet, and an outbound compute node whose traffic is routed through NAT to the internet

    Unsure?

    If you are unsure about which category your situation falls under contact your local IT administrator.

    Configure Remote Firewall

     OSC changed NAT IP addresses on December 14, 2021. Please update the license server's firewall configuration to allow connections from nat.osc.edu (192.148.249.248 to 192.148.249.251). 

    In order for connections from OSC to reach the license server, the license server's firewall will need to be configured.  All outbound network traffic from OSC's compute nodes is routed through a network address translation host (NAT).

    The license server should be configured to allow connections from nat.osc.edu including the following IP addresses to the SERVER:PORT where the license server is running:

    • 192.148.249.248
    • 192.148.249.249
    • 192.148.249.250
    • 192.148.249.251

    A typical FlexNet-based license server uses two ports: a server port and a daemon port. The firewall should be configured for both ports. A typical license file looks like this:

    SERVER licXXX.osc.edu 0050XXXXX5C 28000
    VENDOR {license name} port=28001
    

    In this example, "28000" is the server port, and "28001" is the daemon port. Declaring the daemon port is not mandatory when the license server is used only on a local network, but it becomes necessary when the server is used from outside that network. Please make sure you have declared the daemon port in the license file and configured the firewall for that port.
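    To make the firewall requirement concrete, the sketch below prints one rule per NAT address and port. It assumes the license server is a Linux host managed with iptables, which may not match your site; other firewalls need equivalent rules in their own syntax. The function name osc_license_rules is hypothetical, and the rules are only printed, not applied.

```shell
#!/bin/bash
# Print iptables rules admitting OSC's NAT addresses on the server and daemon ports.
# osc_license_rules is a hypothetical helper name; review the output before applying it.
osc_license_rules() (
    server_port="$1"
    daemon_port="$2"
    for ip in 192.148.249.248 192.148.249.249 192.148.249.250 192.148.249.251; do
        for port in "$server_port" "$daemon_port"; do
            echo "iptables -A INPUT -p tcp -s ${ip} --dport ${port} -j ACCEPT"
        done
    done
)

# Ports from the example license file above.
osc_license_rules 28000 28001
```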

    Confirm Configuration

    The firewall settings should be verified by attempting to connect to the license server from the compute environment using telnet.

    Get on to a compute node by requesting a short, small, interactive job and test the connection using telnet:

    telnet <License Server IP Address> <Port#>
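    If telnet is not available, a similar check can be sketched with bash's built-in /dev/tcp redirection. This assumes a bash build with network redirections enabled and the timeout utility from coreutils. The function name port_open is hypothetical.

```shell
#!/bin/bash
# Succeed if a TCP connection to host:port can be opened within 5 seconds.
# port_open is a hypothetical helper name; it relies on bash's /dev/tcp feature.
port_open() (
    host="$1"
    port="$2"
    timeout 5 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
)
```

    For example, port_open 1.2.3.4 2345 && echo reachable would report whether the server port accepts connections from the node where it is run.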
    

    (Recommended) Restrict Access to IPs/Usernames

    It is also recommended to restrict accessibility using the remote license server's access control mechanisms, such as limiting access to particular usernames in the options.dat file used with FlexNet-based license servers.

    For FlexNet tools, you can add the following line to your options.dat file, one for each user.

    INCLUDEALL USER <OSC username>
    

    If you have a large number of users to give access to you may want to define a group using GROUP within the options.dat file and give access to that whole group using INCLUDEALL GROUP <group name> .
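    A group definition can be generated from a list of usernames, as in the sketch below. The function name flexnet_group_lines and the group name osc_users are hypothetical; the printed lines use the GROUP and INCLUDEALL GROUP keywords described above.

```shell
#!/bin/bash
# Print options-file lines that grant a list of usernames access through one group.
# flexnet_group_lines is a hypothetical helper name.
flexnet_group_lines() (
    group="$1"
    shift                       # remaining arguments are usernames
    echo "GROUP ${group} $*"
    echo "INCLUDEALL GROUP ${group}"
)

flexnet_group_lines osc_users user1 user2 user3
# prints:
# GROUP osc_users user1 user2 user3
# INCLUDEALL GROUP osc_users
```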

    Users who use other license managers should consult the license manager's documentation.

    Modify Job Environment to Point at License Server

    The software must now be told to contact the license server for its licenses.  The exact method of doing so can vary between software packages, but most use an environment variable that specifies the license server IP address and port number to use.

    For example, LS-DYNA uses the environment variables LSTC_LICENSE and LSTC_LICENSE_SERVER to know where to look for the license.  The following lines would be added to a job script to tell LS-DYNA to use licenses from port 2345 on server 1.2.3.4, if you use bash:

    export LSTC_LICENSE=network
    export LSTC_LICENSE_SERVER=2345@1.2.3.4
    

    or, if you use csh:

    setenv LSTC_LICENSE network
    setenv LSTC_LICENSE_SERVER 2345@1.2.3.4
    

    License Server is Behind Port Forwarding Firewall

    If the license server is behind a port forwarding firewall, and has a different IP address from the IP address of the firewall, additional steps must be taken to allow connections to be properly routed within the license server's internal network.  

    1. Use the license server's fully qualified domain name in SERVER line in the license file instead of the IP address.
    2. Contact your IT team to have the firewall IP address mapped to the fully qualified domain name.

     

    Software Specific Details

    The following outlines details particular to a specific software package.  

    ANSYS

    Uses the following environment variables:

    
    ANSYSLI_SERVERS=<port>@<IP>
    ANSYSLMD_LICENSE_FILE=<port>@<IP>
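    In a bash job script, these variables could be set as in the sketch below. The address and both port numbers are placeholder assumptions for illustration only; use the values from your own license server.

```shell
#!/bin/bash
# Placeholder values: replace with your license server's address and ports.
LICENSE_HOST="1.2.3.4"
INTERCONNECT_PORT="2325"   # assumed port for ANSYSLI_SERVERS
FLEXNET_PORT="1055"        # assumed port for ANSYSLMD_LICENSE_FILE

export ANSYSLI_SERVERS="${INTERCONNECT_PORT}@${LICENSE_HOST}"
export ANSYSLMD_LICENSE_FILE="${FLEXNET_PORT}@${LICENSE_HOST}"
```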
    

    If your license server is behind a port forwarding firewall and you cannot use a fully qualified domain name in the license file, you can add ANSYSLI_EXTERNAL_IP={external IP address} to ansyslmd.ini on the license server.

     

     

    HOWTO: Use ulimit command to set soft limits

    This document shows you how to set soft limits using the ulimit command.

    The ulimit command sets or reports user process resource limits. The default limits are defined and applied when a new user is added to the system. Limits are categorized as either soft or hard. With the ulimit command, you can change your soft limits for the current shell environment, up to the maximum set by the hard limits. You must have root user authority to change resource hard limits.

    Syntax

    ulimit [-HSTabcdefilmnpqrstuvx [Limit]]
    flags description
    -H Specifies that the hard limit for the given resource is set. If you have root user authority, you can increase the hard limit. Anyone can decrease it
    -S Specifies that the soft limit for the given resource is set. A soft limit can be increased up to the value of the hard limit. If neither the -H nor -S flags are specified, the limit applies to both
    -a Lists all of the current resource limits
    -b The maximum socket buffer size
    -c The maximum size of core files created
    -d The maximum size of a process's data segment
    -e The maximum scheduling priority ("nice")
    -f The maximum size of files written by the shell and its children
    -i The maximum number of pending signals
    -l The maximum size that may be locked into memory
    -m The maximum resident set size (many systems do not honor this limit)
    -n The maximum number of open file descriptors (most systems do not allow this value to be set)
    -p The pipe size in 512-byte blocks (this may not be set)
    -q The maximum number of bytes in POSIX message queues
    -r The maximum real-time scheduling priority
    -s The maximum stack size
    -t The maximum amount of cpu time in seconds
    -u The maximum number of processes available to a single user
    -v The maximum amount of virtual memory available to the shell and, on some systems, to its children
    -x The maximum number of file locks
    -T The maximum number of threads

    The limit for a specified resource is set when the Limit parameter is specified. The value of the Limit parameter can be a number in the unit specified with each resource, or the value "unlimited." For example, to set the file size limit to 100 blocks (102,400 bytes in bash, where a block is 1024 bytes; 51,200 bytes in shells that use 512-byte blocks), use:

    ulimit -f 100

    To set the size of core dumps to unlimited, use:

    ulimit -c unlimited
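    The difference between soft and hard limits can be inspected directly. The sketch below prints both values for one flag and then lowers the soft core-file limit inside a subshell, so the change does not affect the calling shell. The function name show_limit is hypothetical.

```shell
#!/bin/bash
# Print the soft and hard limit for one ulimit flag, e.g. "c" for core files.
# show_limit is a hypothetical helper name.
show_limit() (
    flag="$1"
    echo "soft=$(ulimit -S "-${flag}") hard=$(ulimit -H "-${flag}")"
)

show_limit c

# Lower the soft core-file limit inside a subshell; the parent shell keeps its limit.
(
    ulimit -S -c 0
    show_limit c
)
```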

    How to change ulimit for a MPI program

    The ulimit command affects the current shell environment. When an MPI program is started, it does not spawn in the current shell. You have to use srun to start a wrapper script that sets the limit if you want to set the limit for each process. Below is how you set the limit for each shell (we use ulimit -c unlimited to allow unlimited core dumps, as an example):

    1. Prepare your batch job script named "myjob" as below (here, we request 2 cores for 5 hours):
    #!/bin/bash
    #SBATCH --ntasks=2
    #SBATCH --time=5:00:00
    #SBATCH ...

    ...
    srun ./test1
    ...
    2. Prepare the wrapper script named "test1" as below, and make it executable (chmod +x test1):
    #!/bin/bash
    ulimit -c unlimited
    .....(your own program)
    3. Submit the job:
    sbatch myjob
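    The wrapper idea can be generalized so that the program to run is passed as arguments instead of being written into the script. The following is a sketch, not the procedure above: the function name run_with_core_limit is hypothetical, and it is written as a function so it can be tried in an ordinary shell.

```shell
#!/bin/bash
# Set the soft core-file limit, then replace the subshell with the given command.
# run_with_core_limit is a hypothetical helper name.
run_with_core_limit() (
    limit="$1"                       # e.g. unlimited, or a number of blocks
    shift                            # remaining arguments are the command to run
    ulimit -S -c "$limit" || exit 1  # fails if the hard limit is lower than requested
    exec "$@"
)

# Example: show the limit seen by a child process.
run_with_core_limit 0 sh -c 'echo "core limit in child: $(ulimit -S -c)"'
```

    To use the same idea with srun, the ulimit and exec "$@" lines can be placed in an executable script file, which is then started as srun ./wrapper ./your_program.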