Artificial Intelligence, Data Analytics and Machine Learning


OSC provides hardware and software services to support the AI, ML, and data analytics needs of our clients. Data-intensive workloads are supported by OSC's high-performance computing environment, which offers many programming languages, interactive computing through OnDemand, and parallel computing.

Hardware support

OSC's data analytics environment is composed of nodes with powerful CPU cores, large amounts of RAM, and local disk space to store and process large amounts of data.

OWENS

On Owens, the data analytics environment has 16 huge memory nodes (Dell PowerEdge R930 four-socket servers with Intel Xeon E5-4830 v3 (Haswell 12 core, 2.10GHz) processors, 1,536GB memory, 12 x 2TB drives).

PITZER

Pitzer has 4 huge memory nodes (Dell PowerEdge R940 four-socket server with Intel Xeon 6148 (Skylake 20 core, 2.40GHz) processors, 3TB memory, 2 x 1TB drives mirrored - 1TB usable).

ASCEND

Ascend, a 2,304-core Dell AMD EPYC™ machine, has 24 nodes, each with 88 usable cores, 921GB of usable memory, and 4 NVIDIA A100 GPUs with 80GB of memory each, connected by NVIDIA NVLink.
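
As a quick sanity check on these figures, the per-node numbers can be multiplied out to system-wide totals. The sketch below uses only the values quoted above and assumes the 2,304 cores are spread evenly across the 24 nodes; the constant and function names are illustrative only.

```python
# Figures quoted in the Ascend description above.
NODES = 24
TOTAL_CORES = 2304
USABLE_CORES_PER_NODE = 88
USABLE_MEMORY_GB_PER_NODE = 921
GPUS_PER_NODE = 4
GPU_MEMORY_GB = 80


def ascend_totals():
    """Derive system-wide totals from the per-node figures."""
    # Assumption: cores are distributed evenly across the nodes.
    cores_per_node = TOTAL_CORES // NODES
    total_gpus = NODES * GPUS_PER_NODE
    return {
        "cores_per_node": cores_per_node,
        "cores_not_usable_per_node": cores_per_node - USABLE_CORES_PER_NODE,
        "usable_cores": NODES * USABLE_CORES_PER_NODE,
        "usable_memory_gb": NODES * USABLE_MEMORY_GB_PER_NODE,
        "gpus": total_gpus,
        "gpu_memory_gb": total_gpus * GPU_MEMORY_GB,
    }


if __name__ == "__main__":
    for name, value in ascend_totals().items():
        print(f"{name}: {value}")
```

Under that assumption each node has 96 cores, of which 88 are usable, giving 2,112 usable cores and 96 A100 GPUs across the system.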

GPU Computing

OSC offers GPU computing on all its systems. While GPUs can provide a significant boost in performance for some applications, the computing model is very different from that of the CPU. This page discusses some of the ways you can use GPU computing at OSC.
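
Before running GPU code, it can help to confirm that a job actually sees a GPU. The following is a minimal sketch, assuming PyTorch is installed in the active Python environment; it returns an empty list when PyTorch or a CUDA device is missing. The function name `visible_gpus` is hypothetical, and how PyTorch is made available on a given cluster is not covered here.

```python
import importlib.util


def visible_gpus():
    """Return the names of CUDA devices visible to this process.

    Returns an empty list if PyTorch is not installed or no GPU is usable.
    """
    # Assumption: PyTorch is the framework in use; skip cleanly if it is absent.
    if importlib.util.find_spec("torch") is None:
        return []
    import torch

    if not torch.cuda.is_available():
        return []
    return [
        torch.cuda.get_device_name(index)
        for index in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    names = visible_gpus()
    if names:
        for index, name in enumerate(names):
            print(f"GPU {index}: {name}")
    else:
        print("No GPU visible; computation would fall back to the CPU.")
```

Running this inside a job is a cheap way to catch a request that was scheduled without a GPU before a long computation starts.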

Data Transfer and Storage

Ohio researchers have access to many file storage options at OSC. OSC has over 14 petabytes (PB) of disk storage capacity distributed over several file systems, plus more than 14 PB of available backup tape storage (with the ability to easily expand to over 23 PB).

File Transfer

Using our web platform, OnDemand, users can transfer smaller files (<10 GB) with simple drag and drop. Other file transfer options include sftp from a command line or a third-party client (like FileZilla).

Globus is a simple but powerful file transfer service that allows our users to share data with collaborators anywhere. Any remote research site that runs Globus can seamlessly connect to OSC’s research storage systems. Globus also connects research systems to personal systems.
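
The options above can be summarized as a simple size-based rule of thumb. The sketch below is illustrative only: the function name is hypothetical, and it assumes the 10 GB guideline for OnDemand means decimal gigabytes. It does not encode any official OSC policy beyond the figures stated above.

```python
from typing import List

# Assumption: "10 GB" is interpreted as 10 * 10**9 bytes.
ONDEMAND_LIMIT_BYTES = 10 * 10**9


def suggest_transfer_options(size_bytes: int) -> List[str]:
    """List the transfer options described above that suit a file of this size."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    # sftp and Globus are listed without a size restriction.
    options = ["sftp", "Globus"]
    # OnDemand drag and drop is described for smaller files only.
    if size_bytes < ONDEMAND_LIMIT_BYTES:
        options.insert(0, "OnDemand drag and drop")
    return options


if __name__ == "__main__":
    for size in (2 * 10**9, 50 * 10**9):
        print(f"{size / 10**9:.0f} GB -> {', '.join(suggest_transfer_options(size))}")
```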


Software Support

Here is a list of software that we offer related to data analytics and machine learning.

Python:
A popular general-purpose, high-level programming language with numerous mathematical and scientific packages available for data analytics and machine learning. The Python programming environment can also be accessed through the Jupyter app on OnDemand.
R:
A programming language for statistical and machine learning applications with very strong graphical capabilities.
RStudio:
RStudio is a free and open-source integrated graphical environment for R. RStudio is available as an OnDemand app with various versions of R.
MATLAB:
A full-featured data analysis toolkit with many advanced algorithms readily available. MATLAB is available as an OnDemand App as well.
Spark:
A big data framework for in-memory processing of data in distributed storage. Spark is also available as an OnDemand app.
Hadoop:
A big data framework for disk-based processing of data in distributed storage.
TensorFlow:
TensorFlow is a free and open-source software library for dataflow and differentiable programming across a range of tasks. It is a symbolic math library and is also used for machine learning applications such as neural networks. 
PyTorch:
PyTorch is a Python machine learning library based on the Torch library, used for applications such as deep learning and natural language processing.
Horovod:
Horovod is a distributed training framework for TensorFlow, PyTorch, and other deep learning frameworks.
Intel Compilers:
Compilers for generating optimized code for Intel CPUs.
Intel MKL:
The Math Kernel Library provides optimized subroutines for common computing tasks such as matrix-matrix calculations.
Other statistical software:
Octave, Stata, FFTW, ScaLAPACK, MINPACK, sprng2

Get a complete list of software available at OSC.

Public Dataset

View more about public dataset availability at OSC.

Containers at OSC

OSC now supports containers for several applications. More information is provided here.

Getting Started

If you are new to supercomputing, new to OSC, or simply interested in getting an account (if you don't already have one), please see here for further information.


Protected Data Service

OSC's Protected Data Service (PDS) is designed to address the most common security control requirements encountered by researchers while also reducing the workload on individual PIs and research teams to satisfy these requirements.

Security, Accessibility, and Policies

OSC is regularly audited for security standards and abides by OSU's digital accessibility standards. OSC also supports export-controlled and HIPAA projects.

OSC Campus Champions

The Ohio Supercomputer Center’s (OSC) Campus Champions program is composed of high-performance computing (HPC) advocates at academic institutions across the state. Campus Champions serve as local proponents for access to and use of OSC resources on their campuses.