Welcome to the COSmology MAchine (COSMA)! Here, you will find some information about the COSMA5, COSMA6 and COSMA7 HPC facilities.
Watch the DellEMC COSMA7 video here.
COSMA has been in existence since July 2001 and is now in its 7th generation.
COSMA5 is a Durham system for ICC users and collaborators.
COSMA6 and COSMA7 are the current DiRAC facilities hosted by Durham. COSMA6 was introduced on 1st April 2017 as the DiRAC-2.5 Data Centric service. COSMA7 was introduced as the DiRAC-2.5x system in May 2018, and was expanded as DiRAC-2.5y in January 2019 and as DiRAC-2.5z in April 2019.
COSMA8 is a new DiRAC system which will become operational in September 2020.
DiRAC is the UK's integrated supercomputing facility for theoretical modelling and HPC-based research in particle physics, astronomy and cosmology. For more information about DiRAC, please visit the DiRAC web pages at http://www.dirac.ac.uk.
The current load on the DiRAC systems (sampled every 15 minutes) can be seen on the SAFE pages: http://dirac-ops.epcc.ed.ac.uk/
The COSMA systems run CentOS 7.6, with a 3.10 Linux kernel.
The login details are:
cosma5: login.cosma.dur.ac.uk or login5.cosma.dur.ac.uk (round-robin to login5a and login5b)
cosma6: login6.cosma.dur.ac.uk (single node, login6a)
cosma7: login7.cosma.dur.ac.uk (round-robin to login7a, login7b and login7c)
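As an illustration of the login details above, the following is a minimal Python sketch that maps each system to its login hostname and builds an ssh command line. The hostnames are the ones listed above; the function name `ssh_command` and the username are hypothetical placeholders, and the sketch assumes a standard `ssh` client is installed locally.

```python
# Login hostnames as listed above (round-robin aliases where available).
LOGIN_HOSTS = {
    "cosma5": "login5.cosma.dur.ac.uk",
    "cosma6": "login6.cosma.dur.ac.uk",
    "cosma7": "login7.cosma.dur.ac.uk",
}


def ssh_command(system, user):
    """Return the argument list for an ssh call to the given COSMA system.

    `user` is a placeholder for your own account name.
    """
    try:
        host = LOGIN_HOSTS[system.lower()]
    except KeyError:
        raise ValueError(f"unknown system: {system!r}") from None
    return ["ssh", f"{user}@{host}"]


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch has no side effects.
    print(" ".join(ssh_command("cosma7", "your_username")))
```

The returned list could be passed to `subprocess.run` to open the session, or the printed command can simply be copied into a terminal.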
All systems share a single Slurm batch system. Jobs can be submitted to any queue (cosma, cosma6 and cosma7) from any login node.
COSMA5 is no longer a DiRAC facility. DiRAC users should therefore use COSMA6 or COSMA7.
COSMA has three periods of scheduled downtime per year, each lasting up to a week, though the affected period is typically shorter. The currently scheduled periods are:
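Because any queue can be targeted from any login node, a job script only needs to name the queue it wants. The following is a minimal Python sketch that renders such a script. It assumes the queue names above map directly to Slurm partition names; the function name `make_batch_script`, the job name, the resource values and the program name are hypothetical, and a real job may need further site-specific options (for example a project account).

```python
# Queue names as given in the text; assumed here to be Slurm partition names.
QUEUES = ("cosma", "cosma6", "cosma7")


def make_batch_script(queue, ntasks, walltime, command, job_name="example"):
    """Render the text of a minimal Slurm batch script for one COSMA queue."""
    if queue not in QUEUES:
        raise ValueError(f"unknown queue: {queue!r}")
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={queue}",  # selects the target queue
        f"#SBATCH --ntasks={ntasks}",    # number of tasks (e.g. MPI ranks)
        f"#SBATCH --time={walltime}",    # wall-clock limit, HH:MM:SS
        "",
        command,
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical job: 16 tasks for one hour on the cosma7 queue.
    print(make_batch_script("cosma7", 16, "01:00:00", "srun ./my_program"))
```

Saving the output to a file such as `job.sh` and running `sbatch job.sh` on a login node would submit it to the chosen queue.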
5-9th October 2020
1-5th February 2021
7-11th June 2021
Ogden Centre for Fundamental Physics - West,
Department of Physics,
Durham DH1 3LE
5/10/20: COSMA downtime starts
21/9/20: COSMA8 Compute nodes powered up, with novel on-chip cooling
7/8/20: COSMA8 service nodes brought into production
26/6/20: COSMA5 back in operation
24/6/20: COSMA5 down - failure of machine room CIS cooling equipment
3/6/20: COSMA downtime completed
15/5/20: COSMA seems to have survived (so far) the world-wide HPC attacks
14/5/20: All users must regenerate SSH keys and upload to SAFE
17/4/20: GCC 9.3 and Intel 2020 (update 1) compilers now available for use
1/4/20: BlueField cluster available for users (first 4 nodes)
16/3/20: x2go installed on login nodes to aid remote working during COVID-19
2/3/20: 16-node BlueField delivered and racked (awaiting power cables)
28/2/20: New database server for virgodb delivered
5/2/20: Permanent host for V100 GPU cards identified
5/2/20: New COSMA5 storage online - from nearly 30kW down to 1.5kW
5/2/20: COSMA is alive again!
3/2/20: COSMA in downtime... back soon
25/11/19: New COSMA6 storage in service across all of COSMA
19/11/19: New COSMA6 storage in service on COSMA6 nodes
11/11/19: New COSMA6 storage migration ongoing