Queuing Systems
When a job is submitted, it is placed in a queue. Different queues are available for different purposes. Users must select the queue from the list below that is appropriate for their computational needs.
Queue | Nodes | x86 processors | Node names | Walltime (HH:MM:SS) | Max jobs per user |
serial | 1 | 1 | gpu1 | 24:00:00 | 2 |
main | 10 | 160 | node1 to node10 | 48:00:00 | 2 |
gpu | 2 | 32 (plus 2688 CUDA cores) | gpu1, gpu2 | 72:00:00 | 2 |
<advisor> | 2 | 32 | node9, node10 | 72:00:00 | unlimited |
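These limits can also be checked on the cluster itself with sinfo; a minimal sketch (the format string is illustrative, and output depends on the SLURM version):
# List each partition with its node count, time limit, and member nodes
sinfo -o "%P %D %l %N"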
Node Configuration
Based on the queues described above, the node configuration can be summarized as follows:
Node name | Node type | Queue assignment | Queue priority |
node1 to node8 | Compute | main | main |
gpu1 | GPU | serial, gpu | serial |
gpu2 | GPU | main, gpu | gpu |
node9 and node10 | Compute | main, <advisor> | <advisor> |
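To confirm which partitions a particular node serves, query its node record (a sketch; gpu1 is taken from the table above):
# Print the record for gpu1; the Partitions= field lists its queue assignment
scontrol show node gpu1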
Scheduler Details: We are using SLURM version 14.03.7. Commonly used script directives and options are summarized below:
Directive / Option | Description |
: (colon) | Indicates a commented-out line that the scheduler ignores. |
#SBATCH | Indicates a special line that the scheduler interprets. |
srun ./hello_parallel | Executes MPI programs; the command uses directions from SLURM to place your job on the scheduled nodes. |
--job-name=hello_serial | Sets the job name shown in the "Name" column of squeue's output. The name has no significance to the scheduler but makes the display easier to read. |
--output=slurm.out, --error=slurm.err | Tell SLURM where to send your job's output stream and error stream, respectively. To discard either stream, set the file name to /dev/null. |
--partition=batch | Sets the partition in which your job will run. |
--qos=normal | Sets the QOS under which your job will run. |
--nodes=4 | Requests four nodes. |
--ntasks-per-node=8 | Requests eight tasks per node. The number of tasks may not exceed the number of processor cores on the node. |
--time=1-12:30:00 | Sets the maximum time SLURM allows your job to run before it is automatically killed. The example requests 1 day, 12 hours, 30 minutes, and 0 seconds. Other formats such as "HH:MM:SS" (for jobs under a day) are also accepted. If the requested time exceeds the limit of the chosen partition/QOS, the scheduler will not run your job. |
--mem-per-cpu=MB | Specifies a memory limit for each process of your job. The default is 2944 MB. |
--mem=MB | Specifies a memory limit for each node of your job. By default, a per-core limit applies instead. |
--exclusive | Requests exclusive access to your job's nodes; the opposite of --share. |
--share | Allows your job to share nodes with other jobs; the opposite of --exclusive. |
--constraint=feature_name | Tells the scheduler that nodes assigned to this job must have the feature "feature_name". |
--gres=resource_name | Tells the scheduler that nodes assigned to this job will use the generic resource "resource_name". |
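Putting these directives together, a minimal sketch of a complete job script (the partition, limits, and program name hello_parallel are illustrative; the queue-specific scripts below show this cluster's actual launcher):
#!/bin/bash
#SBATCH --job-name=hello_parallel
#SBATCH --output=slurm.out
#SBATCH --error=slurm.err
#SBATCH --partition=main
#SBATCH --nodes=4
#SBATCH --ntasks-per-node=8
#SBATCH --time=1-12:30:00
: #SBATCH --exclusive (the leading colon comments this directive out)
srun ./hello_parallel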
Sample scripts for submitting jobs to the various queues:
Serial
#!/bin/bash
#SBATCH --job-name=<myjob>
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --error=myjob.%J.err
#SBATCH --output=myjob.%J.out
#SBATCH --partition=serial
#SBATCH -v
cd ~/<your-path>
# Build a machinefile listing the hosts allocated to this job
MACHINEFILE=machinefile
scontrol show hostname $SLURM_JOB_NODELIST > $MACHINEFILE
# Launch the binary on 1 process, matching the --ntasks-per-node request above
<your path to binary> -batch -np 1 -machinefile $MACHINEFILE -rsh /usr/bin/ssh ~/<your-path>/input-file
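To submit this script and check on it afterwards (the file name serial_job.sh is illustrative):
sbatch serial_job.sh     # prints the job ID on submission
squeue -u $USER          # list only your own jobs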
Main
#!/bin/bash
#SBATCH --job-name=<myjob>
#SBATCH --nodes=6
#SBATCH --ntasks-per-node=16
#SBATCH --error=myjob.%J.err
#SBATCH --output=myjob.%J.out
#SBATCH --partition=main
#SBATCH -v
cd ~/<your-path>
MACHINEFILE=machinefile
scontrol show hostname $SLURM_JOB_NODELIST > $MACHINEFILE
<your path to binary> -batch -np 96 -machinefile $MACHINEFILE -rsh /usr/bin/ssh ~/<your-path>/input-file
GPU
#!/bin/bash
#SBATCH --job-name=<myjob>
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --gres=gpu:1
#SBATCH --error=myjob.%J.err
#SBATCH --output=myjob.%J.out
#SBATCH --partition=gpu
#SBATCH -v
cd ~/<your-path>
MACHINEFILE=machinefile
scontrol show hostname $SLURM_JOB_NODELIST > $MACHINEFILE
<your path to binary> -batch -np 32 -machinefile $MACHINEFILE -rsh /usr/bin/ssh ~/<your-path>/input-file
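Note that --gres=gpu:1 requests one GPU on each allocated node. To claim both GPUs of a node instead, raise the count (a sketch, assuming the generic resource is named gpu on this cluster):
#SBATCH --gres=gpu:2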
Advisor
#!/bin/bash
#SBATCH --job-name=<myjob>
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --error=myjob.%J.err
#SBATCH --output=myjob.%J.out
#SBATCH --partition=<advisor>
#SBATCH -v
cd ~/<your-path>
MACHINEFILE=machinefile
scontrol show hostname $SLURM_JOB_NODELIST > $MACHINEFILE
<your path to binary> -batch -np 32 -machinefile $MACHINEFILE -rsh /usr/bin/ssh ~/<your-path>/input-file
Useful Commands
- For submitting a job:
sbatch submit_script.sh
- For checking queue status:
squeue -l
- For checking node status:
sinfo
- For cancelling a job:
scancel <job-id>
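A few more commands that are often handy (standard SLURM client tools; <job-id> is a placeholder):
scontrol show job <job-id>   # detailed record of a queued or running job
sinfo -N -l                  # long, per-node status listing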
Usage Guidelines
- Users must submit jobs only through the scheduler.
- Users must not run any job on the master node.
- Users are not allowed to run jobs by logging in directly to a compute node.