DGX system overview

Please note that there is no internet access from any node inside the DGX cluster for end users. All data must therefore be copied onto and off of the cluster manually, using scp, rsync, or sftp, as in the example below.
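For example, a project directory can be staged to the cluster and results retrieved afterwards with standard rsync and scp commands. The username and login hostname below are placeholders; substitute the actual login node address for this cluster:

    # Copy a local project directory to your home directory on a login node
    rsync -av ./my_project/ your_user@slogin-01.example.edu:~/my_project/

    # Retrieve results back to your workstation when the job finishes
    scp -r your_user@slogin-01.example.edu:~/my_project/results ./results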

The IBI NVIDIA DGX high-performance cluster is a Slurm-based HPC system with high-speed InfiniBand interconnects and a shared filesystem.

The nodes included in the cluster are:

Dell PowerEdge R650

    Processors:            Intel(R) Xeon(R) Gold 5317 CPU @ 3.00GHz
    Processor Class:       Ice Lake
    Cores per Node:        24
    Nodes in Cluster:      2
    Memory per Node (GB):  128
    Network:               InfiniBand EDR (100 Gbps)
    Node Names:            bcm10-0[1-2]

Dell PowerEdge R650

    Processors:            Intel(R) Xeon(R) Gold 5317 CPU @ 3.00GHz
    Processor Class:       Ice Lake
    Cores per Node:        24
    Nodes in Cluster:      2
    Memory per Node (GB):  128
    Network:               InfiniBand EDR (100 Gbps)
    Node Names:            slogin-0[1-2]

NVIDIA DGX H100

    Processors:        Intel(R) Xeon(R) Platinum 8480CL
    Processor Class:   Sapphire Rapids
    Cores per Node:    112
    Nodes in Cluster:  5
    Node RAM (TB):     2
    GPU Type:          H100
    GPUs per Node:     8
    GPU RAM (GB):      80
    Network:           InfiniBand EDR (100 Gbps)
    Node Names:        dgx-0[1-5]

Shared storage: NFS-mounted filesystem, 300 TB usable.
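Since the cluster is Slurm-based, GPU work on the DGX nodes is submitted as a batch job. The following is a minimal sketch; the partition name is an assumption, so check the actual partition and GRES configuration with sinfo before using it:

    #!/bin/bash
    #SBATCH --job-name=gpu-test
    #SBATCH --partition=dgx        # assumed partition name; verify with sinfo
    #SBATCH --gres=gpu:1           # request one H100 GPU on a DGX node
    #SBATCH --cpus-per-task=8
    #SBATCH --mem=64G
    #SBATCH --time=01:00:00

    # Confirm the allocated GPU is visible inside the job
    nvidia-smi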

