
Durham University


COSMA job queues

COSMA7 queues: cosma7, cosma7-pauper and cosma7-prince

Access to the DiRAC COSMA7 machine is provided through three queues: cosma7, cosma7-pauper and cosma7-prince. These queues share all of the compute nodes (300 in November 2018, increasing to 452 in February 2019), dispatching jobs according to the priorities assigned to each queue. All of the queues enforce job-exclusive access to nodes, meaning that no two jobs share a compute node. Therefore, even if you only need a single core, your project allocation will still be charged for all 28 cores on the node.

The main use of these queues is for DiRAC projects that have been assigned time at Durham. The expected mix of jobs matches the machine's capabilities: MPI, OpenMP or hybrid jobs using up to 28 cores per node, with a maximum of 768 GB of memory per node. Jobs that need fewer resources than a single node should be packaged into a single batch job with appropriate internal process control, so that they scale up to use the whole node.
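A whole-node MPI job on cosma7 might be submitted with a script along these lines. The queue name and node width come from this page; COSMA uses SLURM (hence the #SBATCH directives seen below), but the project account, module names and executable here are placeholders to be replaced with your own:

```shell
#!/bin/bash -l
# Sketch of a COSMA7 submission script. The partition name and 28-core
# node width are from this page; the account, modules and executable
# are placeholders -- substitute your own.
#SBATCH -p cosma7              # the queue described above
#SBATCH -A dp000               # hypothetical DiRAC project account
#SBATCH --nodes=4              # whole nodes: 4 x 28 = 112 cores
#SBATCH --ntasks-per-node=28   # one MPI rank per core
#SBATCH -t 72:00:00            # cosma7 maximum run time
#SBATCH -J my_mpi_job

module purge
module load intel_comp intel_mpi   # placeholder module names

mpirun -np $SLURM_NTASKS ./my_simulation   # placeholder executable
```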

In addition to the hardware limits the queues have the following limits and priorities:

Name           Priority  Maximum run time  Maximum cores
cosma7         Normal    72 hours          unlimited
cosma7-pauper  Low       24 hours          unlimited
cosma7-prince  Highest   unlimited         4096

The three queues share the same resources, so the order in which jobs run is decided by a number of factors. Jobs in higher-priority queues will always be dispatched before jobs in lower-priority queues. However, it may not appear that way, because jobs from lower-priority queues can run as back-fills: a lower-priority job is allowed to start when it will complete before the resources needed for a higher-priority job become available. Setting a run-time limit on your job may therefore get it started, and completed, more quickly. See the Durham utilities descriptions for how to make use of back-filling.
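To be eligible for back-filling, a job only needs to declare a firm run-time limit in its submission script. A minimal sketch (the queue name is from this page; the time value is illustrative):

```shell
# Declaring a run-time limit makes the job a candidate for back-filling:
# the scheduler can slot it in ahead of a larger job if it is guaranteed
# to finish first. Format is hours:minutes:seconds.
#SBATCH -p cosma7-pauper
#SBATCH -t 02:00:00        # job promises to finish within 2 hours
```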

The cosma7 and cosma7-pauper queues are available to all users; the -prince queue can only be accessed by special arrangement. Projects that overrun their quarterly allocation are restricted to the cosma7-pauper queue until the start of the next quarter, and any of their jobs submitted to cosma7 are automatically demoted. This is done to free resources for other projects, not to stop over-budget projects from continuing when resources are available.

Jobs within the same queue are scheduled using a fair-share arrangement: each user starts with the same priority, which is then weighted by a formula based on the resources they have already used. Note that the running order can again be affected by back-filling (which only works if a job declares a run time) and by requesting fewer resources than other jobs.

COSMA6 queues: cosma6, cosma6-pauper and cosma6-prince

Access to the DiRAC COSMA6 machine is provided through three queues: cosma6, cosma6-pauper and cosma6-prince. These queues share all 574 compute nodes, dispatching jobs according to the priorities assigned to each queue. All of the queues enforce job-exclusive access to nodes, meaning that no two jobs share a compute node.

The main use of these queues is for DiRAC projects that have been assigned time at Durham. The expected mix of jobs matches the machine's capabilities: MPI, OpenMP or hybrid jobs using up to 16 cores per node, with a maximum of 128 GB of memory per node. Jobs that need fewer resources than a single node should be packaged into a single batch job with appropriate internal process control, so that they scale up to use the whole node.
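Because the node is yours exclusively, one simple way to package several independent single-core tasks into one batch job is to launch them in the background and wait for them all. A sketch, assuming 16 independent tasks and a placeholder program:

```shell
#!/bin/bash -l
# Sketch: pack 16 independent serial tasks onto one exclusive COSMA6 node
# so the whole node is used. Queue name and 16-core width are from this
# page; the account and program are placeholders.
#SBATCH -p cosma6
#SBATCH -A dp000           # hypothetical project account
#SBATCH --nodes=1
#SBATCH -t 24:00:00

for i in $(seq 1 16); do
    ./my_serial_task "$i" &    # placeholder program, one task per core
done
wait                           # the job ends when all 16 tasks finish
```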

In addition to the hardware limits the queues have the following limits and priorities:

Name           Priority  Maximum run time  Maximum cores
cosma6         Normal    72 hours          unlimited
cosma6-pauper  Low       24 hours          unlimited
cosma6-prince  Highest   unlimited         4096

The three queues share the same resources, so the order in which jobs run is decided by a number of factors. Jobs in higher-priority queues will always be dispatched before jobs in lower-priority queues. However, it may not appear that way, because jobs from lower-priority queues can run as back-fills: a lower-priority job is allowed to start when it will complete before the resources needed for a higher-priority job become available. Setting a run-time limit on your job may therefore get it started, and completed, more quickly. See the Durham utilities descriptions for how to make use of back-filling.

The cosma6 queues are exclusive: no matter how many cores you request per node, you will have exclusive use of that node, and your allocation will be charged as if you were using all of its cores.

The cosma6 and cosma6-pauper queues are available to all users; the -prince queue can only be accessed by special arrangement. Projects that overrun their quarterly allocation are restricted to the cosma6-pauper queue until the start of the next quarter, and any of their jobs submitted to cosma6 are automatically demoted. This is done to free resources for other projects, not to stop over-budget projects from continuing when resources are available.

The quarterly allocation and project usage can be found on the COSMA usage pages. You will need your COSMA username and a password or usage token (generated by the cusagetoken command) to see these.

Jobs within the same queue are scheduled using a fair-share arrangement: each user starts with the same priority, which is then weighted by a formula based on the resources they have already used. Note that the running order can again be affected by back-filling (which only works if a job declares a run time) and by requesting fewer resources than other jobs.

COSMA5 queues: cosma, cosma-pauper, cosma-prince, cordelia, cosma-analyse and cosma-bench

Access to the Durham COSMA5 machine is provided through six queues: cosma, cosma-pauper, cosma-prince, cordelia, cosma-analyse and cosma-bench. The first three queues share all 302 compute nodes, dispatching jobs according to the priorities assigned to each queue. All of these queues except cordelia enforce job-exclusive access to nodes, meaning that no two jobs share a compute node.

The cordelia queue should be used for single-core jobs; it shares the computational resources of a node with other jobs, which allows efficient use of the cluster. When using the cordelia queue, please specify the maximum memory your job will require, e.g. #SBATCH --mem=10G will reserve 10 GB for you and allow jobs on the other cores to use the rest.
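Put together, a single-core cordelia submission might look like this sketch. The queue name and the --mem directive are from this page; the account and program are placeholders:

```shell
#!/bin/bash -l
# Sketch of a single-core job on the shared cordelia queue.
#SBATCH -p cordelia
#SBATCH -A dp000          # hypothetical project account
#SBATCH --ntasks=1        # single core; the node is shared with other jobs
#SBATCH --mem=10G         # reserve 10 GB, leaving the rest for other jobs

./my_analysis_task        # placeholder program
```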

The cosma-analyse queue is for data analysis purposes. The cosma-bench queue is for benchmarking.

The main use of these queues is for Durham projects that have been assigned time at Durham. The expected mix of jobs matches the machine's capabilities: MPI, OpenMP or hybrid jobs using up to 16 cores per node, with a maximum of 126 GB of memory per node. Note that the COSMA5 nodes are disk-less, so this is a hard memory limit and exceeding it will cause your job to fail. Jobs that need fewer resources than a single node should either be submitted to the cordelia queue or packaged into a single batch job with appropriate internal process control, so that they scale up to use the whole node.

In addition to the hardware limits the queues have the following limits and priorities:

Name           Priority  Maximum run time          Maximum cores
cosma          Normal    72 hours                  unlimited
cosma-pauper   Low       24 hours                  unlimited
cosma-prince   Highest   unlimited                 4096
cordelia       Normal    unlimited                 1
cosma-bench    Normal    24 hours                  16
cosma-analyse  Normal    23 hours, 20 minutes (!)  144

The first three queues share the same resources, so the order in which jobs run is decided by a number of factors. Jobs in higher-priority queues will always be dispatched before jobs in lower-priority queues. However, it may not appear that way, because jobs from lower-priority queues can run as back-fills: a lower-priority job is allowed to start when it will complete before the resources needed for a higher-priority job become available. Setting a run-time limit on your job may therefore get it started, and completed, more quickly. See the Durham utilities descriptions for how to make use of back-filling.

The cosma queues (except cordelia) are exclusive: no matter how many cores you request per node, you will have exclusive use of that node, and your allocation will be charged as if you were using all of its cores.

The cosma and cosma-pauper queues are available to all users; the -prince queue can only be accessed by special arrangement.

The quarterly allocation can be found on the COSMA usage pages; you will need your COSMA username and a password or usage token (generated using the cusagetoken command) to see these.

Jobs within the same queue are scheduled using a fair-share arrangement: each user starts with the same priority, which is then weighted by a formula based on the resources they have already used. Note that the running order can again be affected by back-filling (which only works if a job declares a run time) and by requesting fewer resources than other jobs.
