Brief introduction to HPC environments

All materials (c) 2020-2024 by CSC – IT Center for Science Ltd. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, http://creativecommons.org/licenses/by-sa/4.0/

Notes on vocabulary

  • Roughly speaking, one node is a single computer
  • A node on a supercomputer contains:
    • One or more central processing units (CPUs) with many cores
    • Shared memory
  • Some nodes may also have:
    • Local storage
    • Graphics processing units (GPUs)

Cluster systems

  • Login nodes are used to set up jobs (and to launch them)
  • Jobs are run on the compute nodes
  • A batch job system (scheduler) is used to run and manage the jobs
    • On CSC machines, we use Slurm
    • Other common systems include SGE and Torque/PBS
    • The syntax differs, but the basic operation is similar
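
To make the role of the scheduler more concrete, here is a minimal sketch that assembles the text of a Slurm batch script. The `#SBATCH` options used are standard Slurm options, but every value passed in (project name, partition name, program name) is a hypothetical placeholder; check the documentation of your system for the real ones.

```python
def make_batch_script(job_name, account, partition, time_limit,
                      ntasks, mem_per_cpu, command):
    """Return the text of a Slurm batch script.

    All values come from the caller; nothing here is specific
    to any particular cluster.
    """
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --account={account}",          # billing project
        f"#SBATCH --partition={partition}",      # queue to submit to
        f"#SBATCH --time={time_limit}",          # max runtime, HH:MM:SS
        f"#SBATCH --ntasks={ntasks}",            # number of tasks (processes)
        f"#SBATCH --mem-per-cpu={mem_per_cpu}",  # memory per allocated core
        "",
        f"srun {command}",                       # launch the program
    ]
    return "\n".join(lines) + "\n"


script = make_batch_script(
    job_name="demo",
    account="project_0000000",  # hypothetical project name
    partition="small",          # hypothetical partition name
    time_limit="00:10:00",
    ntasks=1,
    mem_per_cpu="1G",
    command="./my_program",     # hypothetical executable
)
print(script)
```

On a login node, text like this would typically be saved to a file and submitted with `sbatch`; the scheduler then runs the job on the compute nodes.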

Available HPC resources

  • Puhti is the general-purpose supercomputer ☑️
  • Mahti is the massively parallel flagship supercomputer
  • LUMI is a European pre-exascale supercomputer operated by CSC
  • Pouta provides cloud resources via OpenStack (IaaS)
  • Rahti provides containers via OKD (PaaS)
  • Allas provides object storage for all services

Which supercomputer to use?

  • What kind of resources can your application use?
    • Can it use more than one core?
    • How much memory will it need?
    • Can it use a GPU or fast local NVMe storage?
    • Which part of your job takes the longest (i.e., what limits the run time)?
  • What kind of resources are available?
    • Is your code already installed?
    • Max runtime, partitions (queues), provisioning policy (per core/per node/other)
    • Each system is different, so check the documentation
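
The effect of the provisioning policy can be shown with a little arithmetic. The sketch below is only an illustration: the function name, the policy labels, and the node sizes in the examples are assumptions, and real systems may apply further rules.

```python
import math


def reserved_cores(requested_cores, cores_per_node, policy):
    """Cores actually reserved for a job under a provisioning policy.

    "per_core": the job gets exactly the cores it asks for.
    "per_node": the request is rounded up to whole nodes.
    """
    if policy == "per_core":
        return requested_cores
    if policy == "per_node":
        nodes = math.ceil(requested_cores / cores_per_node)
        return nodes * cores_per_node
    raise ValueError(f"unknown policy: {policy}")


print(reserved_cores(4, 40, "per_core"))      # 4
print(reserved_cores(4, 128, "per_node"))     # 128
print(reserved_cores(200, 128, "per_node"))   # 256
```

A small job on a per-node system therefore reserves (and is typically billed for) a full node, which is one reason to match the job to the system.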

Quick and dirty comparison of Puhti, Mahti and LUMI

                           Puhti             Mahti        LUMI
Pre-installed apps         120+              20+          see LUMI documentation
Cores per node             40                128          128
Job size (min-max cores)   1-1040            128-25600    1-65536
Memory per node (GiB)      192-1536          256          256-1024
GPU cards                  320 (V100)        96 (A100)    10240 (MI250X)
Fast local disks (NVMe)    106 CPU, 80 GPU   24 GPU       8 CPU, 8 GPU

In short: Mahti is for large parallel jobs, so be prepared to install and optimize your code yourself. Still, even a single Puhti node is roughly 10x your laptop. LUMI is like Mahti plus massive AMD GPU capacity.
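
As a rough way to apply the comparison, the sketch below filters the three systems by job size and memory per node, using the figures from the table above. The dictionary layout and the function name are illustrative assumptions, and the check deliberately ignores partitions, GPUs, and queueing policy.

```python
# Job-size and memory limits copied from the comparison table above.
SYSTEMS = {
    "Puhti": {"min_cores": 1, "max_cores": 1040, "max_mem_gib": 1536},
    "Mahti": {"min_cores": 128, "max_cores": 25600, "max_mem_gib": 256},
    "LUMI": {"min_cores": 1, "max_cores": 65536, "max_mem_gib": 1024},
}


def candidates(cores, mem_per_node_gib):
    """Names of systems whose job-size and memory limits fit the request."""
    return [
        name
        for name, limits in SYSTEMS.items()
        if limits["min_cores"] <= cores <= limits["max_cores"]
        and mem_per_node_gib <= limits["max_mem_gib"]
    ]


print(candidates(cores=4, mem_per_node_gib=16))       # ['Puhti', 'LUMI']
print(candidates(cores=2048, mem_per_node_gib=200))   # ['Mahti', 'LUMI']
print(candidates(cores=40, mem_per_node_gib=1200))    # ['Puhti']
```

A filter like this only narrows the choice; the system documentation remains the authority on what is actually available.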