Running Workflows at CSC

All material (C) 2021-2024 by CSC – IT Center for Science Ltd. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, http://creativecommons.org/licenses/by-sa/4.0/

Outline

  • Methods of running (bio)workflows at CSC
  • Good practices for running high-throughput workflows

Reminder: Submitting workflow jobs at CSC supercomputers

  • Login nodes are used to set up jobs (and to submit them as batch jobs)
    • Don’t launch workflows on login nodes
  • Jobs are run on the compute nodes
    • Interactive nodes can be used as well
    • Singularity/Apptainer is installed on all (compute and login) nodes
  • Slurm batch scheduler is used to run and manage jobs

Reminder: Managing batch jobs

  • A batch job script is submitted to the queue with the command:
    • sbatch example_job.sh
  • List all your jobs that are queuing/running:
    • squeue -u $USER
  • Detailed info of a queuing/running job:
    • scontrol show job <jobid>
  • A job can be deleted using the command:
    • scancel <jobid>
  • Display the resource usage and efficiency of a completed job:
    • seff <jobid>
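For example, a typical job lifecycle looks like this (the job ID 1234567 is illustrative):

sbatch example_job.sh      # Prints e.g. "Submitted batch job 1234567"
squeue -u $USER            # Is the job still queuing or running?
scontrol show job 1234567  # Detailed information while queued/running
seff 1234567               # Resource usage and efficiency after completion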

Methods of running workflows at CSC

  • Deploy workflows with the native Slurm executor
    • Jobs can be spread across the full cluster
    • Pay attention to overheads (Slurm accounting DB / batch queueing)
  • Submit workflows as normal batch jobs
    • Can request a full node
  • Deploy workflows using HyperQueue as a sub-job scheduler
    • Can use multiple nodes

Running Nextflow using native slurm executor

In the nextflow.config file, one can have the following excerpt:

profiles {

    standard {
        process.executor = 'local'
    }

    puhti {
        process.clusterOptions = '--account=project_xxxx --ntasks-per-node=1 --cpus-per-task=4 --ntasks=1 --time=00:00:05'
        process.executor = 'slurm'
        process.queue = 'small'
        process.memory = '10GB'
    }

}

Usage:
> nextflow run workflow.nf -profile puhti
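Note that with the native Slurm executor the Nextflow head process itself still needs somewhere to run: launch it from an interactive session, or wrap the whole pipeline in a batch job as shown on the next slide, rather than running it on a login node.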

Wrapping Nextflow pipeline as a (normal) batch job

  • One can request more resources if needed
  • All processes are run in the same job allocation
#!/bin/bash
#SBATCH --time=00:15:00            # Change your runtime settings
#SBATCH --partition=test           # Change partition as needed
#SBATCH --account=<project>        # Add your project name here
#SBATCH --cpus-per-task=<value>    # Change as needed
#SBATCH --mem-per-cpu=1G           # Increase as needed

# Load Nextflow module
module load nextflow/22.10.1

# Actual Nextflow command here
nextflow run workflow.nf <options>
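Usage (assuming the script above is saved as nextflow_batch.sh):
> sbatch nextflow_batch.sh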

Running Nextflow using HyperQueue executor

  • Multiple nodes can be deployed under the same job allocation
  • No need to queue each sub-job separately (see the CSC documentation for more on HyperQueue)
# Specify a location for the HyperQueue server
export HQ_SERVER_DIR=${PWD}/hq-server-${SLURM_JOB_ID}
mkdir -p "${HQ_SERVER_DIR}"

# Start the server in the background (&) and wait until it has started
hq server start &
until hq job list &>/dev/null ; do sleep 1 ; done

# Start the workers in the background and wait for them to start
srun --overlap --cpu-bind=none --mpi=none hq worker start --cpus=${SLURM_CPUS_PER_TASK} &
hq worker wait "${SLURM_NTASKS}"

# Ensure Nextflow uses the right executor and knows how much it can submit
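# Note: the factor 40 below assumes Puhti compute nodes with 40 CPU cores each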
echo "executor {
  queueSize = $(( 40*SLURM_NNODES ))
  name = 'hq'
  cpus = $(( 40*SLURM_NNODES ))
}" >> nextflow.config

nextflow run <workflow.nf> <options>

# Wait for all jobs to finish, then shut down the workers and server
hq job wait all
hq worker stop all
hq server stop
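For context, a minimal sketch of the batch script the excerpt above would live in (module names are as on Puhti; versions and resource values are illustrative assumptions):

#!/bin/bash
#SBATCH --account=project_xxxx
#SBATCH --partition=small
#SBATCH --nodes=2              # HyperQueue can spread work over several nodes
#SBATCH --ntasks-per-node=1    # One HyperQueue worker per node
#SBATCH --cpus-per-task=40     # All cores of a Puhti compute node
#SBATCH --time=02:00:00

module load nextflow hyperqueue

# ... followed by the HyperQueue server/worker setup and the
# nextflow run command shown above ...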

Getting started with Snakemake at CSC

  • Use pre-installed Snakemake as a module:
    • Puhti: module load snakemake/version
    • LUMI:
      • module use /appl/local/csc/modulefiles/
      • module load snakemake/8.4.6
  • Do your own installations
  • Install your application stack:
    • Local installations (as modules or custom installations)
    • Docker engine (Not possible)
    • Singularity/Apptainer (see the example after this list)
    • Conda (Not supported at CSC)
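For example, a tool packaged in a container image can be run with Apptainer (the image and tool names below are placeholders):

# Run a command from a container image with Apptainer
apptainer exec my_tool.sif my_tool --help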

Deploying Snakemake with native slurm executor

  • Syntax for the Slurm executor depends on the Snakemake version/plugin
  • Submits a separate cluster job for each rule
  • Not suitable for large numbers of short sub-job steps (< 30 min)
module load snakemake/8.4.6

snakemake -s Snakefile --jobs 1 \
    --latency-wait 60 \
    --executor cluster-generic \
    --cluster-generic-submit-cmd "sbatch --time=10 \
        --account=project_xxxx --job-name=hello-world \
        --ntasks-per-node=1 --cpus-per-task=1 \
        --mem-per-cpu=4000 --partition=test"

or, using the dedicated Slurm executor plugin:

snakemake -s Snakefile --jobs 1 \
    --executor slurm --default-resources \
    slurm_account=project_xxxx slurm_partition=test

Submit Snakemake as a batch job

  • One can request more resources if needed
  • All rules are run in the same job allocation
#!/bin/bash
#SBATCH --job-name=myTest
#SBATCH --account=project_xxxxx
#SBATCH --time=00:10:00
#SBATCH --mem-per-cpu=2G
#SBATCH --partition=test
#SBATCH --cpus-per-task=4

module load snakemake/8.4.6
snakemake -s Snakefile --use-singularity --jobs 4

Running Snakemake using HyperQueue executor
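The same pattern as in the Nextflow/HyperQueue example applies: start the HyperQueue server and workers inside a single Slurm allocation, then let Snakemake submit each rule as a HyperQueue sub-job. A minimal sketch using the cluster-generic executor (the hq submit flags are illustrative; see the CSC and HyperQueue documentation for a complete recipe):

# Inside a batch allocation where the HyperQueue server and workers
# are already running (see the Nextflow/HyperQueue slide above)
module load snakemake/8.4.6 hyperqueue

snakemake -s Snakefile --jobs "${SLURM_CPUS_PER_TASK}" \
    --latency-wait 60 \
    --executor cluster-generic \
    --cluster-generic-submit-cmd "hq submit --cpus=1"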

Good practices for running HT workflows (1/3)

  • Avoid unnecessary reads and writes on the Lustre file system (i.e. /scratch) to improve I/O performance
    • If heavy I/O is unavoidable, use the fast local NVMe disk instead of Lustre (see the sketch after this list)
  • Don’t run too many or too short job steps – they will bloat the Slurm accounting DB
    • For many short tasks, prefer a sub-job scheduler such as HyperQueue over the Slurm scheduler
  • Don’t run very long jobs without a checkpoint/restart option.
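A minimal sketch for Puhti: local NVMe disk is requested with --gres and exposed through the $LOCAL_SCRATCH environment variable (the 100 GB size and file names are illustrative):

#SBATCH --gres=nvme:100        # Request 100 GB of fast local NVMe disk

# The allocated disk is available via $LOCAL_SCRATCH inside the job
export TMPDIR=$LOCAL_SCRATCH
cp /scratch/project_xxxx/input.dat "$LOCAL_SCRATCH"/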

Good practices for running HT workflows (2/3)

  • Don’t use Conda installations on Lustre (/projappl, /scratch, $HOME)
    • Containerize Conda environments instead to improve performance (see the sketch after this list)
  • Don’t create a lot of files, especially within a single folder
    • If you’re creating 10 000+ files, you should probably rethink your workflow
  • Consider removing temporary files after the job has finished
  • Whenever possible, separate serial jobs from parallel ones for efficient usage of resources.
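A minimal sketch using CSC’s Tykky container wrapper (the environment file env.yml and the installation path are illustrative assumptions):

module load tykky
conda-containerize new --prefix /projappl/project_xxxx/myenv env.yml

# Use the containerized environment by putting its bin directory on PATH
export PATH="/projappl/project_xxxx/myenv/bin:$PATH"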

Good practices for running HT workflows (3/3)

  • Pin the versions of your tools for reproducibility
  • Use containers for easy portability
  • Set the Singularity/Apptainer cache directory to a scratch folder to avoid filling up your home directory (see the sketch after this list)
  • Avoid placing big databases on Lustre, and avoid putting databases inside a container
  • If you are downloading a lot of data on the fly in your workflow, try to stage it locally first.
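For example, the standard Apptainer environment variables can point the cache and temporary directories to scratch (the paths are illustrative):

export APPTAINER_CACHEDIR=/scratch/project_xxxx/$USER/apptainer_cache
export APPTAINER_TMPDIR=/scratch/project_xxxx/$USER/apptainer_tmp
mkdir -p "$APPTAINER_CACHEDIR" "$APPTAINER_TMPDIR"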