CSC Computing Services for Biousers
This is a webinar on using CSC services, aimed at biousers who are new to the CSC environment.
Outline
- Getting Started with CSC Services
- CSC Computing Environment (HPC/Cloud)
- CSC Data Storage Environment (Allas)
- Easy-to-use Web Applications for Bioinformatics
- Sensitive Data Services at CSC
- Research Data Management for Biousers
- Training
- Take-home Message
Getting Access to CSC Services
- Need to have a CSC account to access CSC services
- Use CSC customer portal (MyCSC) to create an account
- Requires Haka or Virtu user ID (= your institutional e-mail ID)
- Contact our ServiceDesk if you are not part of the Haka/Virtu federation
- Keep your account alive by updating it once a year
- Use your CSC account to manage projects/CSC services/resources
- Usage of services is tied to CSC projects (a.k.a. computing projects)
- CSC services are free of charge for academic use
CSC Supercomputing Environment
- Motivation:
  - Many pre-installed applications in the CSC environment
  - Need for huge computational resources (memory, CPUs, GPUs)
  - Limited resources on your local clusters
  - Parallel and high-throughput computing
  - Better-optimised programs in the CSC environment
- CSC supercomputing options:
  - Puhti – general-purpose supercomputer
  - Mahti – massively parallel flagship supercomputer
  - LUMI – one of the EuroHPC pre-exascale supercomputers
Which Supercomputer to Choose (Puhti vs. Mahti)?
| | Puhti | Mahti |
|---|---|---|
| Number of preinstalled applications | 123+ | 16+ |
| Cores per node | 40 | 128 |
| Job size (min-max) cores | 1-1040 | 128-25600 |
| Memory per node (GiB) | 192-1536 | 256 |
| GPU cards (NVIDIA) | 120 x V100 | 96 x A100 |
| Fast node local disk (NVMe) | 120 | (24 GPU nodes) |
In short: Mahti is for much larger parallel jobs, and you should be prepared to install and optimize your code yourself. (Still, a Puhti node is > 10x your laptop.)
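As a rough illustration of how the job-size limits in the comparison could guide the choice, here is a minimal shell sketch. The function name `suggest_system` is hypothetical (it is not a CSC tool), the thresholds are simply the core counts quoted above, and the rounding to whole 128-core nodes on Mahti is an assumption based on the minimum job size of 128 cores. Application availability, memory and GPU needs matter too and are not considered here.

```shell
# Sketch only: suggest_system is a hypothetical helper, not a CSC command.
# Thresholds are the job-size limits quoted in the Puhti vs. Mahti comparison.
suggest_system() {
    cores="$1"
    if [ "$cores" -ge 1 ] && [ "$cores" -le 1040 ]; then
        echo "Puhti"
    elif [ "$cores" -gt 1040 ] && [ "$cores" -le 25600 ]; then
        # Assumption: Mahti jobs use whole 128-core nodes, so round up.
        nodes=$(( (cores + 127) / 128 ))
        echo "Mahti ($nodes nodes)"
    else
        echo "no match"
    fi
}
```

For example, `suggest_system 40` corresponds to one full Puhti node, while a request for a few thousand cores falls in the Mahti range.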
Cloud Computing Use Case
- Motivation:
  - Need for root access
  - Privately deploying tools with web interfaces
  - CSC private cloud (ePouta) for sensitive data
  - Avoid waiting in batch queues to run jobs
  - Advanced users – able to manage servers
  - Difficult workflows – can’t run on Puhti
CSC Cloud Computing Models
- Infrastructure as a Service (IaaS)
- Platform as a Service (PaaS)
- Software as a Service (SaaS)
CSC Allas Storage Environment
- Apply for the Allas service in your project to get a default quota of 10 TB
- Allas is
  - A storage service for all computing and cloud services
  - A gateway for uploading data from personal laptops or organizational storage systems to the CSC environment
  - Meant for storing data during the project lifetime
  - For sharing data with others at CSC
- Allas is NOT a
  - File system
  - Data management environment
  - Backup service
Connecting to Allas
- On Puhti and Mahti, set up the connection to Allas with the commands:
    module load allas
    allas-conf
- Allas supports two protocols
  - S3 (used by: s3cmd, rclone, a-tools)
  - Swift (used by: swift, rclone, a-tools, Cyberduck)
- GUI clients (e.g. Cyberduck) are also among the clients that can be used to access Allas
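Once the connection has been set up with `module load allas` and `allas-conf`, data can be moved with the a-tools mentioned above. The following is a minimal sketch only: the wrapper function `allas_roundtrip` and the bucket and directory names in the usage line are hypothetical, and it assumes the `a-put` and `a-list` commands from the a-tools are available in the session.

```shell
# Sketch only: run on Puhti or Mahti after `module load allas` and `allas-conf`.
# allas_roundtrip is a hypothetical wrapper, not part of the a-tools.
allas_roundtrip() {
    bucket="$1"   # target bucket in Allas
    dir="$2"      # local directory to upload
    a-put -b "$bucket" "$dir"   # upload the directory to the given bucket
    a-list "$bucket"            # list the objects now stored in the bucket
}
```

A call could look like `allas_roundtrip my-project-bucket results`. Because Allas is object storage rather than a file system, data is uploaded and downloaded as whole objects instead of being edited in place.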
Fairdata.fi Services
- With the Fairdata.fi services you can store, share, describe and publish your research data with easy-to-use web tools.
- IDA – Research data storage service
- ETSIN – Research data finder
- QVAIN – Research dataset metadata tool
- FAIRDATA-PAS – Digital preservation for research data
- Fairdata services are offered by the Ministry of Education and Culture (Minedu) and produced by CSC