Owens

Darshan

Darshan is a lightweight "scalable HPC I/O characterization tool".  It is intended to profile I/O by emitting log files to a consistent log location for systems administrators, and also provides scripts to create summary PDFs to characterize I/O in MPI-based programs.

Availability and Restrictions

Versions

The following versions of Darshan are available on OSC clusters:

Spark

Apache Spark is an open source cluster-computing framework originally developed in the AMPLab at University of California, Berkeley but was later donated to the Apache Software Foundation where it remains today. In contrast to Hadoop's disk-based analytics paradigm, Spark has multi-stage in-memory analytics. Spark can run programs up-to 100x faster than Hadoop’s MapReduce in memory or 10x faster on disk. Spark support applications written in python, java, scala and R

Changes of Default Memory Limits

Problem Description

Our current GPFS file system is a distributed process with significant interactions between the clients. As the compute nodes being GPFS flle system clients, a certain amount of memory of each node needs to be reserved for these interactions. As a result, the maximum physical memory of each node allowed to be used by users' jobs is reduced, in order to keep the healthy performance of the file system. In addition, using swap memory is not allowed.  

The table below summarizes the maximum physical memory allowed for each type of nodes on our systems:

WARP3D

From WARP3D's webpage:

WARP3D is under continuing development as a research code for the solution of large-scale, 3-Dsolid models subjected to static and dynamic loads. The capabilities of the code focus on 
fatigue & fracture analyses primarily in metals. WARP3D runs on laptops-to-supercomputers and can analyze models with several million nodes and elements. 

Availability and Restrictions

Versions

The following versions of WARP3D are available on OSC clusters:

Technical Specifications

The following are technical specifications for Owens.  

Number of Nodes

824 nodes

Number of CPU Sockets

1,648 (2 sockets/node)

Number of CPU Cores

23,392 (28 cores/node)

Cores Per Node

28 cores/node (48 cores/node for Huge Mem Nodes)

Local Disk Space Per Node

~1,500GB in /tmp

Amber 16 now available

Date: 
Friday, July 29, 2016 - 4:15pm
System(s): 

Amber 16 has been installed on the OSC clusters; usage is via the module amber/16. For information on available executables and installation details see the software page for Amber or the output of the module help command, e.g.: module help amber/16.  On August 15, 2016 Amber 16 will be made the default amber module.

R and Rstudio

R is a language and environment for statistical computing and graphics. It is an integrated suite of software facilities for data manipulation, calculation, and graphical display. It includes

  • an effective data handling and storage facility,
  • a suite of operators for calculations on arrays, in particular matrices,
  • a large, coherent, integrated collection of intermediate tools for data analysis,
  • graphical facilities for data analysis and display either on-screen or on hardcopy, and
  • a well-developed, simple and effective programming language which includes conditionals, loops, user-defined recursive functions and input, and output facilities

More information can be found here.

Bowtie2

Bowtie2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.

Pages