General Revised Procedure for Job Submission

SERIAL

Script for a single core serial job

#!/bin/bash
#SBATCH -J job_name                # name of the job
#SBATCH -p shared                  # name of the partition: available options "shared"
#SBATCH -n 1                       # no of processes
#SBATCH -t 01:00:00                # walltime in HH:MM:SS, Max value 72:00:00

# list of modules you want to use, for example
#module load apps/lammps/12.12.2018/intel

# name of the executable
exe=name_executable

# run the application
$exe                               # specify the application command-line options, if any, after $exe
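If the application needs command-line options or input files, they go after $exe. A minimal sketch with hypothetical names (my_app, in.data and my_app.log are placeholders; replace them with your own):

exe=./my_app                       # placeholder executable name
$exe -in in.data > my_app.log      # placeholder options; redirect stdout to a log file if desired

Save the script (for example as serial_job.sh) and submit it with "sbatch serial_job.sh" from your $SCRATCH directory.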

OpenMP

Script for a single-node OpenMP job

#!/bin/bash
#SBATCH -J job_name                # name of the job
#SBATCH -p shared                  # name of the partition: available options "shared"
#SBATCH -N 1                       # no of nodes
#SBATCH -n 1                       # no of processes or tasks
#SBATCH --cpus-per-task=10         # no of threads per process or task
#SBATCH -t 01:00:00                # walltime in HH:MM:SS, Max value 72:00:00

# list of modules you want to use, for example
#module load apps/lammps/12.12.2018/intel

# name of the executable
exe=name_executable

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# run the application
$exe                               # specify the application command-line options, if any, after $exe
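Beyond OMP_NUM_THREADS, thread placement can affect performance. The two lines below are an optional sketch using the standard OpenMP affinity environment variables (not specific to PARAM Shakti; add them next to the OMP_NUM_THREADS line, or omit them):

export OMP_PROC_BIND=close         # standard OpenMP variable: keep threads close to the parent process
export OMP_PLACES=cores            # standard OpenMP variable: one thread place per physical core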

MPI

Script for a multi-node MPI job

#!/bin/bash
#SBATCH -J job_name                # name of the job
#SBATCH -p medium                  # name of the partition: available options "large, medium"
#SBATCH -N 2                       # no of nodes
#SBATCH --ntasks-per-node=40       # ntasks per node
#SBATCH -t 01:00:00                # walltime in HH:MM:SS, Max value 72:00:00 for medium, and 168:00:00 for large

# list of modules you want to use, for example
#module load apps/lammps/12.12.2018/intel

# name of the executable
exe=name_executable

# run the application
mpirun -bootstrap slurm -n $SLURM_NTASKS $exe      # specify the application command-line options, if any, after $exe

** Please keep the number of MPI tasks (i.e. $SLURM_NTASKS) a multiple of 40 to avoid node sharing between different users and to increase the overall utilization of the cluster.
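With -N 2 and --ntasks-per-node=40 above, Slurm sets $SLURM_NTASKS to 2 x 40 = 80, which is exactly what mpirun launches. An optional sanity check that can be placed before the mpirun line to record this in the job output:

echo "Nodes allocated : $SLURM_JOB_NUM_NODES"      # 2 in this example
echo "Tasks per node  : $SLURM_NTASKS_PER_NODE"    # 40 in this example
echo "Total MPI tasks : $SLURM_NTASKS"             # 2 x 40 = 80, a multiple of 40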

 

MPI+OpenMP(Hybrid)

Script for a multi-node MPI+OpenMP (hybrid) job

#!/bin/bash
#SBATCH -J job_name                # name of the job
#SBATCH -p medium                  # name of the partition: available options "medium, large"
#SBATCH -N 2                       # no of nodes
#SBATCH --ntasks-per-node=40       # ntasks per node
#SBATCH --cpus-per-task=1          # number of threads per task - used for OpenMP
#SBATCH -t 01:00:00                # walltime in HH:MM:SS, Max value 72:00:00 for medium, and 168:00:00 for large

# list of modules you want to use, for example
#module load apps/lammps/12.12.2018/intel

# name of the executable
exe=name_executable

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# run the application
mpirun -bootstrap slurm -n $SLURM_NTASKS $exe      # specify the application command-line options, if any, after $exe
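The header above runs 40 MPI tasks per node with a single OpenMP thread each. To run fewer tasks with more threads while still filling each 40-core node, keep ntasks-per-node x cpus-per-task equal to 40. An illustrative variant of the header (the values are examples only):

#SBATCH -N 2                       # no of nodes
#SBATCH --ntasks-per-node=8        # 8 MPI tasks per node
#SBATCH --cpus-per-task=5          # 5 OpenMP threads per task; 8 x 5 = 40 cores per node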

 

GPU+MPI+OpenMP

Script for GPU+MPI+OpenMP job

#!/bin/bash
#SBATCH -J job_name                # name of the job
#SBATCH -p gpu                     # name of the partition: available options "gpu"
#SBATCH -N 1                       # no of nodes
#SBATCH -n 4                       # no of processes or tasks
#SBATCH --gres=gpu:1               # request gpu card: it should be either 1 or 2
#SBATCH --cpus-per-task=4          # no of threads per process or task
#SBATCH -t 01:00:00                # walltime in HH:MM:SS, max value 72:00:00

# list of modules you want to use, for example
#module load apps/lammps/12.12.2018/cuda9.2/gpu

# name of the executable
exe=name_executable

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK

# run the application
mpirun -bootstrap slurm -n $SLURM_NTASKS $exe      # specify the application command-line options, if any, after $exe
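To confirm which GPU(s) the job actually received, a short optional check can be placed before the mpirun line. This is a sketch; it assumes nvidia-smi is available on the gpu partition nodes and that Slurm exports CUDA_VISIBLE_DEVICES for the allocated cards:

echo "CUDA_VISIBLE_DEVICES = $CUDA_VISIBLE_DEVICES"   # GPU index/indices assigned to this job, if set by Slurm
nvidia-smi                                            # list the visible GPU(s) and their current status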

General procedure to prepare the batch script and submit the job

PARAM Shakti makes extensive use of modules. The purpose of a module is to provide the production environment for a given application/compiler/library. It also specifies which version of the application/compiler/library is available in a given session. All of them are made available through module files. A user must load the appropriate module(s) from the available modules before launching the job through Slurm scripts.

module avail                       # lists all the available modules
module load intel/2018.0.1.163     # loads the Intel compilers into your environment
module unload intel/2018.0.1.163   # removes all environment settings related to the previously loaded intel-2018 compiler
module show intel/2018.0.1.163     # displays information about the intel module, including environment changes
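A common pattern at the top of a job script is to start from a clean environment, load exactly what the run needs, and record what was loaded. A sketch using module names that appear elsewhere in this document:

module purge                               # remove all currently loaded modules (start clean)
module load intel/2018.0.1.163             # compiler environment
module load apps/lammps/12.12.2018/intel   # application module (example from this document)
module list                                # print the loaded modules into the job output for the record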

A simple Slurm job script (This is a sample script to demonstrate different parameters)

#!/bin/bash

#SBATCH -J lammps-mpi              # job name (--job-name)
#SBATCH -o %j-out-myjob            # name of stdout output file (--output)
#SBATCH -e %j-err-myjob            # name of stderr error file (--error)
#SBATCH -p shared                  # queue (--partition) name; available options "shared, medium, large or gpu"
#SBATCH -n 3                       # total number of MPI tasks (--ntasks; should be 1 for serial)
#SBATCH -c 2                       # number of threads per task (--cpus-per-task)
#SBATCH -t 00:10:00                # walltime (--time), maximum duration of the run
#SBATCH --mem=23000                # minimum memory required per node, in MB if no unit is specified; optional
#SBATCH --mem-per-cpu=6000M        # minimum memory required per allocated CPU; optional (use only one of --mem or --mem-per-cpu)
#SBATCH --mail-user=abc@iitkgp.ac.in   # user's email ID where job status info will be sent
#SBATCH --mail-type=ALL            # send mail for all types of events regarding the job

# Thread reservation section
if [ -n "$SLURM_CPUS_PER_TASK" ]; then
    omp_threads=$SLURM_CPUS_PER_TASK
else
    omp_threads=1
fi

# Load application/compiler/library modules, for example
# module load apps/lammps/7Aug19/intel

# name of the executable
exe=name_executable

export OMP_NUM_THREADS=$omp_threads

# Executable along with the right parameters and input files
mpirun -bootstrap slurm -n $SLURM_NTASKS $exe
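To use the sample script, save it under a name of your choice (lammps-job.sh below is a placeholder), submit it, and inspect the stdout/stderr files produced by the %j (job ID) pattern in the -o and -e directives; the job ID 12345 below is likewise a placeholder:

$ sbatch lammps-job.sh
Submitted batch job 12345
$ squeue -u $USER                  # check the job state (PD = pending, R = running)
$ cat 12345-out-myjob              # stdout, per the -o %j-out-myjob directive
$ cat 12345-err-myjob              # stderr, per the -e %j-err-myjob directive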

Parameters used in SLURM job script

The job flags are used with the SBATCH command. The syntax for a SLURM directive in a script is "#SBATCH <flag>". Some of the flags are also used with the srun and salloc commands; most can also be passed directly on the sbatch command line (an example follows the table below). To know more about the parameters, refer to the man page ("man sbatch").

Resource                         Flag Syntax                           Description
partition                        --partition=<partition name> (or) -p  Partition is a queue for jobs.
time                             --time=01:00:00 (or) -t               Time limit for the job.
total no. of MPI tasks           --ntasks=8 (or) -n 8                  Total MPI tasks required by the job.
number of threads per MPI task   --cpus-per-task=4 (or) -c 4           Cores per MPI task.
GPU                              --gres=gpu:N, where N = 1 or 2        Request use of GPUs on compute nodes.
error file                       --error=err.out                       Job error messages will be written to this file.
job name                         --job-name="lammps"                   Name of the job.
output file                      --output=lammps.out                   Name of the file for stdout.
email address                    --mail-user=username@iitkgp.ac.in     User's email address.
memory                           --mem=2300                            Memory required per node, in MB.
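Most of these flags can also be given on the sbatch command line, where they take precedence over the corresponding #SBATCH lines in the script. A sketch, assuming a script file named job.sh (placeholder name):

$ sbatch -p shared -n 4 -t 01:00:00 --job-name=test job.sh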

Submit the job
   

Submitting a simple standalone job

* Users must submit jobs from their $SCRATCH directory (use the "cd $SCRATCH" command to navigate there).

Submit the job script (here named "slurm-job.sh") with sbatch and check its status with squeue:

$ sbatch slurm-job.sh
  Submitted batch job 106

$ squeue
  JOBID PARTITION     NAME     USER ST   TIME  NODES NODELIST(REASON)
    150  standard   simple    user1  R   0:31      1 atom01
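Once submitted, the job can be monitored or cancelled with the standard Slurm commands (the job ID 150 is taken from the squeue output above):

$ squeue -u $USER                  # show only your own jobs
$ scontrol show job 150            # detailed information about a specific job
$ scancel 150                      # cancel the job if it is no longer needed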

Sample batch scripts are available at "/home/iitkgp/slurm-scripts". For more information on Slurm usage, refer to https://slurm.schedmd.com/documentation.html