Technical Specifications

The following are technical specifications for Quad GPU nodes.  

Number of Nodes

24 nodes

Number of CPU Sockets

48 (2 sockets/node)

Number of CPU Cores

2,304 (96 cores/node)

Cores Per Node

96 cores/node (88 usable cores/node)

Internal Storage

12.8 TB NVMe internal storage

Compute CPU Specifications

AMD EPYC 7643 (Milan) processors for compute
  • 2.3 GHz
  • 48 cores per processor

Computer Server Specifications

24 Dell XE8545 servers

Accelerator Specifications

4 NVIDIA A100 GPUs with 80 GB memory each, connected by NVIDIA NVLink

Number of Accelerator Nodes

24 total

Total Memory

~24 TB

Physical Memory Per Node

1 TB

Physical Memory Per Core

10.6 GB

Interconnect

Mellanox/NVIDIA 200 Gbps HDR InfiniBand
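
The cluster-wide totals for the Quad GPU nodes follow from the per-node figures by simple multiplication. The short Python sketch below reproduces that arithmetic. It is an illustration only: the variable names are hypothetical, and reading "1 TB" as 1024 GB is an assumption. Under that assumption the memory per core comes out at about 10.67 GB, which the list above gives as 10.6 GB.

```python
# Per-node figures for the Quad GPU nodes, copied from the list above.
NODES = 24
SOCKETS_PER_NODE = 2
CORES_PER_SOCKET = 48
USABLE_CORES_PER_NODE = 88
MEMORY_PER_NODE_GB = 1024  # assumption: "1 TB" is read as 1024 GB
GPUS_PER_NODE = 4
GPU_MEMORY_GB = 80

# Derived values.
cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
total_sockets = NODES * SOCKETS_PER_NODE
total_cores = NODES * cores_per_node
reserved_cores_per_node = cores_per_node - USABLE_CORES_PER_NODE
total_memory_tb = NODES * MEMORY_PER_NODE_GB / 1024
memory_per_core_gb = MEMORY_PER_NODE_GB / cores_per_node
gpu_memory_per_node_gb = GPUS_PER_NODE * GPU_MEMORY_GB

print(f"Cores per node:           {cores_per_node}")
print(f"Total CPU sockets:        {total_sockets}")
print(f"Total CPU cores:          {total_cores}")
print(f"Reserved cores per node:  {reserved_cores_per_node}")
print(f"Total memory (TB):        {total_memory_tb:.0f}")
print(f"Memory per core (GB):     {memory_per_core_gb:.2f}")
print(f"GPU memory per node (GB): {gpu_memory_per_node_gb}")
```

The difference between 96 physical cores and 88 usable cores implies that 8 cores per node are not available to jobs. The list does not say what they are used for.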

 

The following are technical specifications for Dual GPU nodes.  

Number of Nodes

250 nodes

Number of CPU Sockets

500 (2 sockets/node)

Number of CPU Cores

35,072 (128 cores/node)

Cores Per Node

128 cores/node (120 usable cores/node)

Internal Storage

1.92 TB NVMe internal storage

Compute CPU Specifications

2 AMD EPYC 7H12 processors for compute
  • 2.60 GHz
  • 64 cores per processor

Computer Server Specifications

250 Dell R7525 servers

Accelerator Specifications

2 NVIDIA A100 GPUs with 40 GB memory each, PCIe, 250 W

Number of Accelerator Nodes

500 total

Total Memory

~125 TB

Physical Memory Per Node

0.5 TB

Physical Memory Per Core

4 GB

Interconnect

HDR100 InfiniBand (100 Gbps)
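
The memory and accelerator figures for the Dual GPU nodes can be checked the same way, from the per-node values. The Python sketch below is a minimal illustration with hypothetical variable names. It makes two assumptions: "0.5 TB" is read as 512 GB, and the "500 total" accelerator figure is read as a GPU count (250 nodes with 2 GPUs each). The memory per usable core is a derived value that does not appear in the list.

```python
# Per-node figures for the Dual GPU nodes, copied from the list above.
NODES = 250
CORES_PER_NODE = 128
USABLE_CORES_PER_NODE = 120
MEMORY_PER_NODE_TB = 0.5
GPUS_PER_NODE = 2
GPU_MEMORY_GB = 40

# Assumption: 1 TB is treated as 1024 GB.
memory_per_node_gb = MEMORY_PER_NODE_TB * 1024

# Derived values.
total_memory_tb = NODES * MEMORY_PER_NODE_TB
memory_per_core_gb = memory_per_node_gb / CORES_PER_NODE
memory_per_usable_core_gb = memory_per_node_gb / USABLE_CORES_PER_NODE
total_gpus = NODES * GPUS_PER_NODE
gpu_memory_per_node_gb = GPUS_PER_NODE * GPU_MEMORY_GB

print(f"Total memory (TB):           {total_memory_tb:.0f}")
print(f"Memory per core (GB):        {memory_per_core_gb:.0f}")
print(f"Memory per usable core (GB): {memory_per_usable_core_gb:.2f}")
print(f"Total GPUs:                  {total_gpus}")
print(f"GPU memory per node (GB):    {gpu_memory_per_node_gb}")
```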
