Job Scheduling (revised)


This information is applicable to IITKGP internal users. Other users (NSMapp, NSMext, etc.) are requested to contact CDAC for information on storage, charging policy, queuing priority, etc.

Scheduler

PARAM Shakti uses Slurm-22.05.09 (open source) as the workload manager for the HPC facility.

The following partitions/queues have been defined for different requirements.

Partition | Min-Max cores/nodes per job | Max walltime | Priority | Comments
----------|-----------------------------|--------------|----------|---------
shared    | 1-36 cores                  | 03 days      | 200      | Consists of Compute Nodes only. Node sharing is allowed between different jobs running in this partition.
medium    | 1-10 nodes                  | 03 days      | 200      | Consists of Compute and High-Memory Nodes. Whole nodes must be requested. Node sharing is not allowed between different jobs.
large     | 1-10 nodes                  | 07 days      | 10       | Consists of Compute and High-Memory Nodes. Whole nodes must be requested. Node sharing is not allowed between different jobs.
gpu       | -                           | -            | -        | Consists of GPU Nodes only. No change - same as the existing configuration.

1. shared partition: This partition is designed for serial and OpenMP jobs. Users can request a minimum of 1 core and a maximum of 36 cores. The maximum walltime for jobs in this partition is 3 days. 

2. medium partition: This partition is suitable for both single-node and multi-node jobs. When submitting a job to this partition, the entire node is allocated exclusively to the job. The minimum and maximum number of nodes that can be requested under this partition are 1 and 10 nodes, respectively. It's important to note that the parameter "--ntasks-per-node=40" should not be changed while submitting a job to this partition. The maximum walltime for jobs in this partition is 3 days. 

3. large partition: Similar to the medium partition, this partition allows for both single-node and multi-node jobs. However, it has a maximum walltime of 7 days and operates at a lower priority. Unless a job specifically requires a walltime exceeding 3 days, it is recommended not to use this partition due to its lower priority. 

4. gpu partition: The configuration for the GPU partition remains the same as before. This partition is dedicated to jobs that require GPU resources. 
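As an illustration, a batch script for the shared partition might look like the following sketch. The job name, output file names, and executable are placeholders; adjust them to your own application:

```shell
#!/bin/bash
#SBATCH --job-name=my_omp_job      # placeholder job name
#SBATCH --partition=shared         # shared partition: 1-36 cores, node sharing allowed
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=8          # request up to 36 cores on this partition
#SBATCH --time=2-00:00:00          # walltime; maximum is 3 days on shared
#SBATCH --output=job_%j.out        # %j expands to the Slurm job ID
#SBATCH --error=job_%j.err

# Jobs must be submitted from /scratch/$USER (see Storage policy below)
cd /scratch/$USER

export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
./my_openmp_app                    # placeholder executable
```

For the medium or large partitions, request whole nodes instead, e.g. `#SBATCH --partition=medium`, `#SBATCH --nodes=2`, and `#SBATCH --ntasks-per-node=40` (which, as noted above, should not be changed). Submit the script with `sbatch job.sh`.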

**To access high-memory nodes with more than 4.3 GB of memory per core, you can use the option "--exclude=cn[085-384]" or "--mem-per-cpu=AAG" while submitting your job. Please replace "AA" with a number between 4 and 18 based on your specific memory requirement. This option allows you to allocate the necessary amount of memory (in GB per core) for your job. 
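For example, either of the following submission lines (the value 8G is illustrative) would steer a job toward the high-memory nodes:

```shell
# Request 8 GB of memory per core (replace 8 with a value between 4 and 18)
sbatch --mem-per-cpu=8G job.sh

# Alternatively, exclude the standard compute nodes so that
# only high-memory nodes remain eligible for the job
sbatch --exclude=cn[085-384] job.sh
```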

Please refer to the "Usage Charges" tab to request high-priority queues.

 

*The charging policy is subject to change.

 

Storage policy

**Users must submit jobs from /scratch/$USER directory.**

FileSystem | Quota                                                        | Retention Period*
-----------|--------------------------------------------------------------|------------------
/home      | Soft Limit = 40G, Hard Limit = 50G (except for UG students)  | Unlimited
/scratch   | Soft Limit = 2.0T                                            | 60 days
*Files older than the retention period will be deleted

**Users can check their /home and /scratch quotas using the "myquota" command

Users need to keep the data and applications related to their project/research work on PARAM Shakti.

To store this data, two directories, "/home" and "/scratch", are available to all users. Within each, every user gets a personal directory named after their username (/home/<username> and /scratch/<username>) where they can store their data.

/home/<username>/: This directory is generally used by the user to install applications.

 

/scratch/<username>/: This directory is used to store the user data related to the project/research.

However, the storage provided to users is limited: quotas have been defined on these directories, and all users are allotted the same quota by default. When a user wishes to transfer data from their local system (laptop/desktop) to the HPC system, they can use various methods and tools.

Users on the Windows operating system can use tools that are native to Microsoft Windows or that can be installed on their Windows machine. Linux users do not require any additional tool; they can simply use the "scp" command in their terminal, as described in the File Transfers section.
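A typical scp invocation might look like the following sketch. The hostname shown is a placeholder; use the login-node address provided to you, and replace <username> with your actual username:

```shell
# Copy a local file to your scratch directory on PARAM Shakti
# (hostname is a placeholder; use the login node address you were given)
scp mydata.tar.gz <username>@<login-node-address>:/scratch/<username>/

# Copy results back from PARAM Shakti to the local machine
scp <username>@<login-node-address>:/scratch/<username>/results.tar.gz .
```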

Users are advised to keep a copy of their data with themselves by transferring it from PARAM Shakti to their local system (laptop/desktop) once the project/research work is completed. The commands shown in the File Transfers section can be used for these transfers.