System Downtime March 14, 2023
A downtime for OSC HPC systems is scheduled from 7 a.m. to 9 p.m., Tuesday, March 14, 2023. The downtime will affect the Pitzer, Owens and Ascend Clusters, web portals, and HPC file servers. MyOSC (https://my.osc.edu) and state-wide licenses will be available during the downtime. In preparation for the downtime, the batch scheduler will not start jobs that cannot be completed before 7 a.m., March 14, 2023.
OSC enables Globus High Assurance storage endpoint
A new High Assurance Globus endpoint for OSC will be deployed to manage protected data on February 2, 2023.
This will affect current projects at OSC that use Globus to manage their protected data. These projects will need to use the new High Assurance Globus endpoint to access their data. The names of the new endpoints are OSC /fs/ess High Assurance for project storage (/fs/ess) and OSC /fs/scratch High Assurance for scratch storage (/fs/scratch).
Changes to MPI libraries after Dec. 13, 2022 downtime
During the Dec. 13, 2022 downtime, we will upgrade MOFED from 4.9 to 5.6 and recompile all OpenMPI and MVAPICH2 installations on Owens against the newer MOFED version.
After the downtime, users with their own MPI libraries on Owens may see job failures and will need to rebuild their applications and relink them against the MPI libraries.
System Downtime December 13, 2022
A downtime for OSC HPC systems is scheduled from 7 a.m. to 9 p.m., Tuesday, December 13, 2022. The downtime will affect the Pitzer, Owens and Ascend Clusters, web portals, and HPC file servers. MyOSC (https://my.osc.edu) and state-wide licenses will be available during the downtime. In preparation for the downtime, the batch scheduler will not start jobs that cannot be completed before 7 a.m., December 13, 2022. Jobs that are not started on clusters will be held until after the downtime and then started once the system is returned to production status.
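The scheduling rule above comes down to simple date arithmetic. The following is a minimal sketch in Python, not the batch scheduler's actual logic: the function names are hypothetical, and it assumes a job whose requested walltime ends exactly at 7 a.m. still counts as completing in time.

```python
from datetime import datetime, timedelta

# Start of the downtime window, taken from the announcement above.
DOWNTIME_START = datetime(2022, 12, 13, 7, 0)


def can_start_before_downtime(start_time, walltime, downtime_start=DOWNTIME_START):
    """Return True if a job starting at start_time would finish by downtime_start.

    Assumption: finishing exactly at the start of the downtime is allowed.
    """
    return start_time + walltime <= downtime_start


def longest_walltime(start_time, downtime_start=DOWNTIME_START):
    """Return the longest walltime that still fits before the downtime."""
    return max(downtime_start - start_time, timedelta(0))


if __name__ == "__main__":
    now = datetime(2022, 12, 12, 19, 0)  # 7 p.m. the evening before
    print(can_start_before_downtime(now, timedelta(hours=12)))  # True
    print(can_start_before_downtime(now, timedelta(hours=13)))  # False
    print(longest_walltime(now))  # 12:00:00
```

In practice, requesting a walltime shorter than the time remaining before the downtime is one way to let a job start rather than be held.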
nccl
The NVIDIA Collective Communication Library (NCCL) implements multi-GPU and multi-node communication primitives optimized for NVIDIA GPUs and networking. NCCL provides routines such as all-gather, all-reduce, broadcast, reduce, and reduce-scatter, as well as point-to-point send and receive, that are optimized to achieve high bandwidth and low latency over PCIe and NVLink high-speed interconnects within a node and over NVIDIA Mellanox networking across nodes.
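To make the collective routines named above concrete, here is a toy single-process simulation of what each one computes. It is an illustration only and does not use NCCL or its API: each "rank" is a plain Python list, the function names are hypothetical, sum is assumed as the reduction operator, and reduce-scatter assumes the buffer length divides evenly by the number of ranks.

```python
def all_reduce(buffers):
    """Every rank ends with the element-wise sum of all ranks' buffers."""
    total = [sum(vals) for vals in zip(*buffers)]
    return [list(total) for _ in buffers]


def all_gather(buffers):
    """Every rank ends with all ranks' buffers concatenated in rank order."""
    gathered = [x for buf in buffers for x in buf]
    return [list(gathered) for _ in buffers]


def reduce_scatter(buffers):
    """Rank r ends with the r-th chunk of the element-wise sum."""
    n = len(buffers)
    total = [sum(vals) for vals in zip(*buffers)]
    chunk = len(total) // n  # assumes len(total) is divisible by n
    return [total[r * chunk:(r + 1) * chunk] for r in range(n)]


def broadcast(buffers, root=0):
    """Every rank ends with a copy of the root rank's buffer."""
    return [list(buffers[root]) for _ in buffers]


if __name__ == "__main__":
    ranks = [[1, 2], [3, 4]]  # two ranks, two elements each
    print(all_reduce(ranks))         # [[4, 6], [4, 6]]
    print(all_gather(ranks))         # [[1, 2, 3, 4], [1, 2, 3, 4]]
    print(reduce_scatter(ranks))     # [[4], [6]]
    print(broadcast(ranks, root=1))  # [[3, 4], [3, 4]]
```

The real library performs these operations across GPU memory and interconnects; the sketch only shows the resulting data layout on each rank.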
GSL
GSL (the GNU Scientific Library) is a library of mathematical routines for the C and C++ languages.