1. Introduction:

SLURM (Simple Linux Utility for Resource Management) is a highly scalable and fault-tolerant cluster manager and job scheduling system that can be used in large computing node clusters and is widely used by supercomputers and computing clusters worldwide. SLURM allocates resources reasonably to the task queue and monitors the job until it completes.
See: 链接

2. Common commands:

image.png

3. Assignment submission

3-1: Interactive mode

srun -n 4 ./example

3-2: Batch mode

#! /bin/env bash # file: example.sh # set the number of nodes #SBATCH --nodes=2 # set the number of tasks (processes) per node #SBATCH --ntasks-per-node=4 # set partition #SBATCH --partition=example-partition # set max wallclock time #SBATCH --time=2:00:00 # set name of job #SBATCH --job-name=example-mpi4py # set batch script's standard output #SBATCH --output=example.out

Assignment submission:

sbatch example.sh

3-3: Distribution mode

salloc -N 2 -n 4 -p example-partition -t 100 /bin/bash