Introduction to containers

All material (C) 2020-2021 by CSC - IT Center for Science Ltd. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License, http://creativecommons.org/licenses/by-sa/4.0/

The one-slide lecture

  • Some software packages on CSC supercomputers are installed as containers
    • May cause some changes to usage
    • See instructions for each software for details
  • Containers provide an easy way for you to install software
    • Single command installation if a suitable Docker or Singularity container is available

Containers

  • Containers are a way to package software together with its dependencies (libraries etc.)
  • Popular container engines include Docker, Singularity and Shifter
  • Singularity is the most popular in HPC environments

Containers vs. Virtual Machines (1/2)

Containers vs. Virtual Machines (2/2)

  • Virtual machines can run a completely different OS than the host (e.g. Windows on a Linux host or vice versa)
  • Containers share the kernel with the host, but can have their own libraries etc.
    • Can run e.g. a different Linux distribution than the host

Container benefits: Ease of installation

  • Containers are becoming a popular way to distribute software
    • Single command installation
    • All dependencies included, so more portable
    • Normal user rights are enough when using an existing container
  • Root access on the build system is enough
    • Root access and package managers (yum, apt etc.) can be used when building, even if they are not available on the target system
    • Makes installing libraries etc. easier

Container benefits: Environment isolation

  • Containers use the host system kernel, but can have their own Bins/Libs layer
    • Can be a different Linux distribution than the host
    • Can solve some incompatibilities
    • Less likely to be affected by changes in the host system

Container benefits: Environment reproducibility

  • Analysis environment can be saved as a whole
    • Useful with e.g. Python, where updating underlying libraries (NumPy etc.) can lead to differences in behavior
  • Sharing with collaborators is easy (single file)

Docker in a nutshell

  • Probably the most popular container engine
    • Large selection of software available as Docker containers
  • Requires root access to run
    • Not suited for HPC environments
  • Containers can be writable at run time
    • Can cause problems when trying to run with Singularity

Docker basic usage

  • Pull a container from registry

    docker pull <container>
  • Run a container

    docker run [options] <container>
    
    • Will pull the container automatically if a local image is not available
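
As a concrete illustration of the two commands above, here is a minimal sketch. The image name ubuntu:22.04 is only an example of a public image and is not part of the course material. The commands are attempted only if docker is found on the machine.

```shell
# Example image name (an assumption for illustration; any public image works).
image="ubuntu:22.04"

if command -v docker >/dev/null 2>&1; then
    # Pull the image from the registry, then run one command in a
    # temporary container (--rm removes the container afterwards).
    docker pull "$image" &&
        docker run --rm "$image" cat /etc/os-release ||
        echo "docker failed (daemon not running or insufficient rights?)"
else
    echo "docker not found; nothing was run."
fi
```

The explicit pull is optional here, since docker run pulls a missing image on its own.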

Singularity in a nutshell

  • Containers can be run with user-level rights
    • But: Building new containers requires root access or access to a remote build service
  • Minimal performance overhead
  • Supports MPI
    • But: Requires containers tailored to host system
  • Can use the host driver stack (NVIDIA/CUDA)
    • Add option --nv
  • Can import and run Docker containers
    • Running Docker directly would require root rights

Singularity on CSC servers

  • Singularity jobs should be run as batch jobs or with sinteractive
  • No need to load a module
  • Users can run their own containers
  • Some CSC software installations are provided as containers
    • See software pages for details
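
A batch job for a container could look roughly like the sketch below, which writes a small Slurm job script to a file. The partition name, time limit and memory request are assumptions to adapt to the actual system. project_12345, image.sif and the file name singularity_job.sh are placeholders.

```shell
# Write a minimal batch job script (all resource values are placeholders).
cat > singularity_job.sh <<'EOF'
#!/bin/bash
#SBATCH --account=project_12345
#SBATCH --partition=small
#SBATCH --time=00:15:00
#SBATCH --mem=4G

# No module load is needed; call singularity directly.
singularity exec image.sif cat /etc/os-release
EOF

# The script is not submitted automatically; this only prints a reminder.
if command -v sbatch >/dev/null 2>&1; then
    echo "Submit with: sbatch singularity_job.sh"
else
    echo "sbatch not found; singularity_job.sh was written but not submitted."
fi
```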

Running Singularity containers: Basic syntax

  • Execute a command in the container

    singularity exec [options] <container> <command>
  • Run the default action (runscript) of the container

    singularity run [options] <container>
    • Defined when the container is built
  • Open a shell in the container

    singularity shell [options] <container>
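
With concrete values filled in, the syntax above might be used as in this sketch. image.sif is a placeholder file name, and the example assumes that cat and /etc/os-release exist inside the image.

```shell
# Placeholder image name; replace with the path to a real image file.
image="image.sif"

if command -v singularity >/dev/null 2>&1 && [ -f "$image" ]; then
    # Execute a single command inside the container.
    singularity exec "$image" cat /etc/os-release
else
    echo "Singularity or $image not found; nothing was run."
fi

# The other two forms take the image in the same way:
#   singularity run image.sif      (default action of the container)
#   singularity shell image.sif    (interactive shell in the container)
```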

Getting help

  • Check the developer documentation

  • Check container help

    singularity run-help image.sif
    • Only available if added when container was created
  • Try running find inside the container to find file paths

    singularity exec image.sif find / -type f -name my_app.py 2>/dev/null
    • Requires that find is available in the container

File system

  • Containers have their own, internal file system
    • The internal FS is always read-only when run with user level rights
  • To access host directories, they need to be mapped to container directories
    • E.g. to map host directory /scratch/project_12345 to directory /data inside the container: --bind /scratch/project_12345:/data
    • Target directory inside the container does not need to exist. It is created as necessary
    • More than one directory can be mapped
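
The mapping described above can be sketched as follows, with two directories bound at once. The second host directory (/scratch/project_12345/results), its target /results and the image name image.sif are hypothetical and only serve the illustration.

```shell
# Placeholder image name.
image="image.sif"

# One --bind option per mapping, in the form host_dir:container_dir.
bind_opts="--bind /scratch/project_12345:/data --bind /scratch/project_12345/results:/results"

# Build the full command and print it so it can be checked before use.
cmd="singularity exec $bind_opts $image ls /data /results"
echo "$cmd"

# Run it only where Singularity and the image are actually present.
if command -v singularity >/dev/null 2>&1 && [ -f "$image" ]; then
    $cmd
fi
```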

Environment variables

  • Most environment variables from host are inherited by the container
  • Can be prevented, if necessary, by adding option --cleanenv
  • An environment variable can be set for the container only by setting $SINGULARITYENV_variablename on the host
    • E.g. to set $TEST in the container, set $SINGULARITYENV_TEST on the host
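
A short sketch of the $TEST example above. The value "hello" and the image name image.sif are placeholders, and the check assumes that sh is available inside the container.

```shell
# On the host: this makes TEST=hello visible inside the container.
export SINGULARITYENV_TEST="hello"

# Placeholder image name.
image="image.sif"

if command -v singularity >/dev/null 2>&1 && [ -f "$image" ]; then
    # Single quotes keep the host shell from expanding $TEST itself.
    singularity exec "$image" sh -c 'echo "TEST=$TEST"'
else
    echo "Singularity or $image not found; SINGULARITYENV_TEST=$SINGULARITYENV_TEST"
fi
```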

singularity_wrapper

  • Running containers with singularity_wrapper takes care of the most common --bind options

    singularity_wrapper exec image.sif myprog <options>
  • If the environment variable $SING_IMAGE is set to the path of the image, the image file can be omitted as well

    singularity_wrapper exec myprog <options>
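
Setting $SING_IMAGE might look like the sketch below. The image location in the current directory is an assumption, and myprog stands for whatever program the container provides, as in the lines above.

```shell
# Point SING_IMAGE at the image file (the path here is a placeholder).
export SING_IMAGE="$PWD/image.sif"

if command -v singularity_wrapper >/dev/null 2>&1 && [ -f "$SING_IMAGE" ]; then
    # The image argument is left out; the wrapper reads it from SING_IMAGE.
    singularity_wrapper exec myprog --help
else
    echo "singularity_wrapper or $SING_IMAGE not found; nothing was run."
fi
```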

Using Docker containers with Singularity

  • You can build a Singularity container from a Docker container with normal user rights:

    singularity build <image> docker://<address>:<tag>
  • For example:

    singularity build pytorch_19.10-py3.sif docker://nvcr.io/nvidia/pytorch:19.10-py3
  • Documentation in Docs:

Singularity containers as installation method

  • Singularity is a good option in cases where installation is otherwise problematic:
    • Complex installations with many dependencies
    • Obsolete dependencies incompatible with general environment
      • Still needs to be kernel compatible
  • Should be considered even when other methods exist

Just a random example (FASTX-toolkit)

  • Tried different installation methods:
    • Native: 47 files, total size 1.9 MB
    • Conda: 27464 files, total size 1.1 GB
    • Singularity: 1 file, total size 339 MB
  • Containers are not the solution for everything, but they do have their uses…