Māui Slurm Partitions
Tip

Partitions on these systems that may be used for NeSI workloads carry the prefix nesi_.
Māui (XC50) Slurm Partitions
Nodes are not shared between jobs on Māui, so the minimum charging unit is node-hours, where 1 node-hour is 40 core-hours, or 80 Slurm CPU-hours.
There is only one partition available to NeSI jobs:
| Name | Nodes | Max Walltime | Avail / Node | Max / Account | Description |
|---|---|---|---|---|---|
| nesi_research | 316 | 24 hours | 80 CPUs, 90 or 180 GB RAM | 240 nodes, 1200 node-hours running | Standard partition for all NeSI jobs. |
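For illustration, a whole-node job on this partition could look like the following sketch; the project code, job name and executable are placeholders, not real values:

```sh
#!/bin/bash -e
#SBATCH --job-name=maui_job        # placeholder name
#SBATCH --account=nesi12345        # placeholder project code
#SBATCH --partition=nesi_research
#SBATCH --nodes=1                  # nodes are not shared, so whole nodes are charged
#SBATCH --time=02:00:00            # well under the 24-hour maximum
#SBATCH --ntasks-per-node=40       # one task per physical core

srun ./my_program                  # placeholder executable
```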
Limits
As a consequence of the above limit on the node-hours reserved by your running jobs (GrpTRESRunMins in the Slurm documentation, shown in squeue output as the reason "AssocGrpCPURunMinutes" when you reach it), you can occupy more nodes simultaneously if your jobs request a shorter time limit:
| nodes | hours | node-hours | limits reached |
|---|---|---|---|
| 1 | 24 | 24 | 24 hours |
| 50 | 24 | 1200 | 1200 node-hours, 24 hours |
| 100 | 12 | 1200 | 1200 node-hours |
| 240 | 5 | 1200 | 1200 node-hours, 240 nodes |
| 240 | 1 | 240 | 240 nodes |
Most of the time job priority will be the most important influence on how long your jobs have to wait - the above limits are just backstops to ensure that Māui's resources are not all committed too far into the future, so that debug and other higher-priority jobs can start reasonably quickly.
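For example, a job script along the following lines would stay within the 1200 running node-hour limit by pairing a wide node request with a shorter time limit; the project code, job name and executable are placeholders:

```sh
#!/bin/bash -e
#SBATCH --job-name=wide_short_job  # placeholder name
#SBATCH --account=nesi12345        # placeholder project code
#SBATCH --partition=nesi_research
#SBATCH --nodes=100                # 100 nodes ...
#SBATCH --time=12:00:00            # ... for 12 hours = 1200 node-hours
#SBATCH --ntasks-per-node=40       # one task per physical core

srun ./my_program                  # placeholder executable
```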
Debug QoS
Each job has a "QoS", with the default QoS for a job being determined by the allocation class of its project. Specifying --qos=debug will override that and give the job high priority, but it is subject to strict limits: 15 minutes per job, only 1 job at a time per user, and at most 2 nodes per job.
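A short test job could request the debug QoS as in this sketch; the project code, job name and executable are placeholders:

```sh
#!/bin/bash -e
#SBATCH --job-name=debug_test      # placeholder name
#SBATCH --account=nesi12345        # placeholder project code
#SBATCH --partition=nesi_research
#SBATCH --qos=debug                # high priority; at most 2 nodes and 1 job at a time
#SBATCH --nodes=1
#SBATCH --time=00:15:00            # must not exceed the 15-minute debug limit

srun ./my_test_program             # placeholder executable
```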
Māui_Ancil (CS500) Slurm Partitions
| Name | Nodes | Max Walltime | Avail / Node | Max / Job | Max / User | Description |
|---|---|---|---|---|---|---|
| nesi_prepost | 4 | 24 hours | 80 CPUs, 720 GB RAM | 20 CPUs, 700 GB RAM | 80 CPUs, 700 GB RAM | Pre- and post-processing tasks. |
| nesi_gpu | 4 to 5 | 72 hours | 4 CPUs, 12 GB RAM, 1 P100 GPU* | 4 CPUs, 12 GB RAM, 1 P100 GPU | 4 CPUs, 12 GB RAM, 1 P100 GPU | GPU jobs and visualisation. |
| nesi_igpu | 0 to 1 | 2 hours | 4 CPUs, 12 GB RAM, 1 P100 GPU* | 4 CPUs, 12 GB RAM, 1 P100 GPU | 4 CPUs, 12 GB RAM, 1 P100 GPU | Interactive GPU access, 7am - 8pm. |
* NVIDIA Tesla P100 PCIe 12GB card
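As an illustration only, a post-processing job on the ancillary nodes might be requested as follows, keeping within the per-job limits above; the project code, job name and script are placeholders:

```sh
#!/bin/bash -e
#SBATCH --job-name=postproc        # placeholder name
#SBATCH --account=nesi12345        # placeholder project code
#SBATCH --partition=nesi_prepost
#SBATCH --cpus-per-task=4          # within the 20-CPU per-job limit
#SBATCH --mem=100G                 # within the 700 GB per-job limit
#SBATCH --time=04:00:00

srun ./postprocess.sh              # placeholder command
```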
Requesting GPUs
Nodes in the nesi_gpu
partition have 1 P100 GPU card each. You can
request it using:
```sh
#SBATCH --partition=nesi_gpu
#SBATCH --gpus-per-node=1
```
Note that you need to specify the name of the partition. You also need to specify a number of CPUs and amount of memory small enough to fit on these nodes.
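Putting it together, a complete GPU job script might look like the following sketch, with the CPU and memory requests kept within the per-job limits in the table above; the project code, job name and executable are placeholders:

```sh
#!/bin/bash -e
#SBATCH --job-name=gpu_job         # placeholder name
#SBATCH --account=nesi12345        # placeholder project code
#SBATCH --partition=nesi_gpu
#SBATCH --gpus-per-node=1          # the single P100 on the node
#SBATCH --cpus-per-task=4          # at most 4 CPUs per job on this partition
#SBATCH --mem=12G                  # at most 12 GB RAM per job on this partition
#SBATCH --time=12:00:00            # within the 72-hour maximum

srun ./my_gpu_program              # placeholder executable
```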
See GPU use on NeSI for more details about Slurm and CUDA settings.