Māui Slurm Partitions
Tip

Partitions on these systems that may be used for NeSI workloads carry the prefix nesi_.
Māui (XC50) Slurm Partitions
Nodes are not shared between jobs on Māui, so the minimum charging unit is node-hours, where 1 node-hour is 40 core-hours, or 80 Slurm CPU-hours.
There is only one partition available to NeSI jobs:
| Name | Nodes | Max Walltime | Avail / Node | Max / Account | Description |
|---|---|---|---|---|---|
| nesi_research | 316 | 24 hours | 80 CPUs, 90 or 180 GB RAM | 240 nodes, 1200 node-hours running | Standard partition for all NeSI jobs. |
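For illustration, a whole-node job on this partition could look like the following sketch; the project code, job name and executable are placeholders, not real values:

```sh
#!/bin/bash -e
#SBATCH --job-name=maui_job        # placeholder name
#SBATCH --account=nesi12345        # placeholder project code
#SBATCH --partition=nesi_research
#SBATCH --nodes=1                  # nodes are not shared, so whole nodes are charged
#SBATCH --time=02:00:00            # well under the 24-hour maximum
#SBATCH --ntasks-per-node=40       # one task per physical core

srun ./my_program                  # placeholder executable
```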
Limits
As a consequence of the above limit on the node-hours reserved by your running jobs (GrpTRESRunMins in the Slurm documentation, shown in squeue output as the reason "AssocGrpCPURunMinutes" when you reach it), you can occupy more nodes simultaneously if your jobs request a shorter time limit:
| nodes | hours | node-hours | limits reached |
|---|---|---|---|
| 1 | 24 | 24 | 24 hours |
| 50 | 24 | 1200 | 1200 node-hours, 24 hours |
| 100 | 12 | 1200 | 1200 node-hours |
| 240 | 5 | 1200 | 1200 node-hours, 240 nodes |
| 240 | 1 | 240 | 240 nodes |
Most of the time job priority will be the most important influence on how long your jobs have to wait - the above limits are just backstops to ensure that Māui's resources are not all committed too far into the future, so that debug and other higher-priority jobs can start reasonably quickly.
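For example, a job script along the following lines would stay within the 1200 running node-hour limit by pairing a wide node request with a shorter time limit; the project code, job name and executable are placeholders:

```sh
#!/bin/bash -e
#SBATCH --job-name=wide_short_job  # placeholder name
#SBATCH --account=nesi12345        # placeholder project code
#SBATCH --partition=nesi_research
#SBATCH --nodes=100                # 100 nodes ...
#SBATCH --time=12:00:00            # ... for 12 hours = 1200 node-hours
#SBATCH --ntasks-per-node=40       # one task per physical core

srun ./my_program                  # placeholder executable
```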
Debug QoS
Each job has a "QoS", with the default QoS for a job being determined by the allocation class of its project. Specifying --qos=debug will override that and give the job high priority, but it is subject to strict limits: 15 minutes per job, only 1 job at a time per user, and at most 2 nodes per job.
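A short test job could request the debug QoS as in this sketch; the project code, job name and executable are placeholders:

```sh
#!/bin/bash -e
#SBATCH --job-name=debug_test      # placeholder name
#SBATCH --account=nesi12345        # placeholder project code
#SBATCH --partition=nesi_research
#SBATCH --qos=debug                # high priority; at most 2 nodes and 1 job at a time
#SBATCH --nodes=1
#SBATCH --time=00:15:00            # must not exceed the 15-minute debug limit

srun ./my_test_program             # placeholder executable
```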
Māui_Ancil (CS500) Slurm Partitions
| Name | Nodes | Max Walltime | Avail / Node | Max / Job | Max / User | Description |
|---|---|---|---|---|---|---|
| nesi_prepost | 4 | 24 hours | 80 CPUs, 720 GB RAM | 20 CPUs, 700 GB RAM | 80 CPUs, 700 GB RAM | Pre- and post-processing tasks. |
| nesi_gpu | 4 to 5 | 72 hours | 4 CPUs, 12 GB RAM, 1 P100 GPU* | 4 CPUs, 12 GB RAM, 1 P100 GPU | 4 CPUs, 12 GB RAM, 1 P100 GPU | GPU jobs and visualisation. |
| nesi_igpu | 0 to 1 | 2 hours | 4 CPUs, 12 GB RAM, 1 P100 GPU* | 4 CPUs, 12 GB RAM, 1 P100 GPU | 4 CPUs, 12 GB RAM, 1 P100 GPU | Interactive GPU access, 7am - 8pm. |
* NVIDIA Tesla P100 PCIe 12GB card
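As an illustration only, a post-processing job on the ancillary nodes might be requested as follows, keeping within the per-job limits above; the project code, job name and script are placeholders:

```sh
#!/bin/bash -e
#SBATCH --job-name=postproc        # placeholder name
#SBATCH --account=nesi12345        # placeholder project code
#SBATCH --partition=nesi_prepost
#SBATCH --cpus-per-task=4          # within the 20-CPU per-job limit
#SBATCH --mem=100G                 # within the 700 GB per-job limit
#SBATCH --time=04:00:00

srun ./postprocess.sh              # placeholder command
```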
Requesting GPUs
Nodes in the nesi_gpu
partition have 1 P100 GPU card each. You can
request it using:
```sh
#SBATCH --partition=nesi_gpu
#SBATCH --gpus-per-node=1
```
Note that you need to specify the name of the partition. You also need to specify a number of CPUs and amount of memory small enough to fit on these nodes.
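Putting it together, a complete GPU job script might look like the following sketch, with the CPU and memory requests kept within the per-job limits in the table above; the project code, job name and executable are placeholders:

```sh
#!/bin/bash -e
#SBATCH --job-name=gpu_job         # placeholder name
#SBATCH --account=nesi12345        # placeholder project code
#SBATCH --partition=nesi_gpu
#SBATCH --gpus-per-node=1          # the single P100 on the node
#SBATCH --cpus-per-task=4          # at most 4 CPUs per job on this partition
#SBATCH --mem=12G                  # at most 12 GB RAM per job on this partition
#SBATCH --time=12:00:00            # within the 72-hour maximum

srun ./my_gpu_program              # placeholder executable
```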
See GPU use on NeSI for more details about Slurm and CUDA settings.