High Performance Computing

DC facilities

The DC has a rotating ventilation system with built-in redundancy, a UPS able to maintain the electricity supply for all servers for more than 30 minutes, and multiple power supply lines for each rack. As protection against fire, the DC has an automatic trifluoromethane gas fire-suppression system.

Compute cluster

16 compute nodes giving a total of:

  • 240 cores
  • 2 TB RAM
  • 16 TB local storage (1 TB/node)

Made up of three different types of nodes:

  • HP DL160 G6: 8 cores (Xeon E5640) and 64 GB RAM (x8)
  • Fujitsu RX200 S7: 12 cores (Xeon E5-2640) and 120 GB RAM (x3)
  • HP DL 360 G9: 28 cores (Xeon E5-2680 v4) and 256 GB RAM (x5)
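
The totals quoted for the cluster can be checked against the per-node figures. The following is a minimal sketch that only re-does the arithmetic from the list above; the `NodeType` class and `cluster_totals` function are hypothetical names introduced for illustration, not part of any software running on the cluster.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeType:
    model: str
    cores: int    # cores per node
    ram_gb: int   # RAM per node, in GB
    count: int    # number of nodes of this type


# Figures copied from the node list above.
NODE_TYPES = [
    NodeType("HP DL160 G6", cores=8, ram_gb=64, count=8),
    NodeType("Fujitsu RX200 S7", cores=12, ram_gb=120, count=3),
    NodeType("HP DL 360 G9", cores=28, ram_gb=256, count=5),
]


def cluster_totals(node_types):
    """Return (nodes, cores, ram_gb) summed over all node types."""
    nodes = sum(t.count for t in node_types)
    cores = sum(t.cores * t.count for t in node_types)
    ram_gb = sum(t.ram_gb * t.count for t in node_types)
    return nodes, cores, ram_gb


nodes, cores, ram_gb = cluster_totals(NODE_TYPES)
print(f"{nodes} nodes, {cores} cores, {ram_gb} GB RAM")
# prints "16 nodes, 240 cores, 2152 GB RAM"
```

The sum of 2152 GB matches the rounded figure of 2 TB given in the summary.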

A GPU server with 4 graphics accelerators, giving a total of:

  • 27,648 CUDA Cores
  • 1 TB of RAM
  • 7 TB of fast storage

Each NVIDIA A100-SXM4-40GB graphics accelerator features:

  • 6,912 CUDA FP32 cores
  • 3,456 CUDA FP64 cores
  • 432 Tensor Cores
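
The server-wide CUDA core count follows from multiplying the per-accelerator figures by the four accelerators. A short sketch of that arithmetic is given below; the 40 GB of memory per GPU is an assumption read from the model name (A100-SXM4-40GB), and is separate from the 1 TB of RAM listed for the server, which is presumably host memory.

```python
GPU_COUNT = 4  # accelerators in the GPU server

# Per-accelerator figures from the list above.
# "GPU memory (GB)" is an assumption taken from the model name, not from the list.
PER_GPU = {
    "CUDA FP32 cores": 6912,
    "CUDA FP64 cores": 3456,
    "Tensor Cores": 432,
    "GPU memory (GB)": 40,
}

# Multiply every per-GPU figure by the number of accelerators.
server_totals = {name: value * GPU_COUNT for name, value in PER_GPU.items()}

for name, total in server_totals.items():
    print(f"{name}: {total:,}")
```

The FP32 total, 27,648, is the CUDA core figure quoted for the server as a whole.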

Storage system

The main storage system is a high-availability Isilon (Dell/EMC) Gen 6 with four H400 nodes and four A200 nodes.

  • Usable storage capacity: 450 TB
  • Snapshot system accessible to users

The system provides a unified file system for all users, available from the compute nodes and from workstations, which simplifies work for researchers.

Backup/Data Recovery

The backup/data-recovery system is made up of:

  • HPE 4048 library with two LTO-7 drives and capacity for 48 tapes (768 TB capacity)
  • Quantum Scalar i80 library with two LTO-5 drives and capacity for 50 tapes

Both libraries are managed by Bacula backup software.

The main library (HPE 4048) holds backup copies of group data, virtual machines, databases and data stored long-term.

The secondary library (Quantum Scalar i80) holds redundant copies of virtual machines and work archived on LTO-5 tape.

The system makes daily backups of data, with a default data retention policy of 3 months. Archive backup jobs are kept indefinitely.
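
The retention rule can be illustrated with a small sketch. It is not how Bacula implements retention; it only models the policy described above. The function name `is_expired`, the approximation of 3 months as 90 days, and the example dates are all assumptions made for illustration.

```python
from datetime import date, timedelta

RETENTION_DAYS = 90  # assumption: "3 months" approximated as 90 days


def is_expired(backup_date, today, archive=False):
    """Archive jobs never expire; other backups expire after the retention window."""
    if archive:
        return False
    return today - backup_date > timedelta(days=RETENTION_DAYS)


today = date(2024, 6, 30)  # hypothetical reference date
print(is_expired(date(2024, 6, 1), today))                  # False: 29 days old
print(is_expired(date(2024, 1, 15), today))                 # True: 167 days old
print(is_expired(date(2020, 1, 15), today, archive=True))   # False: archive job
```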

Network

The DC network is made up of three 1 Gb 3Com 4210G switches and three 10 Gb HPE 5700 switches. The switches and the connections to the machines are configured for high availability. The Isilon storage system, the main backup system and the DL 360 compute nodes are connected at 10 Gb; the rest of the cluster uses 1 Gb with aggregated interfaces.