→ Snakemake Tutorial
Key: - | XB | IP | TD | - | CL
This session introduces command-line, text-based workflow management systems, using Snakemake and Nextflow as examples. We will start with an overview of concepts and challenges, then spend an hour on each platform.
Snakemake
We will show how to define a workflow in the Snakemake workflow language and how to execute it using the Snakemake command line interface. In particular, we will show how Snakemake supports reproducible science through:
- automation of every step of a data analysis from raw data to final figures
- scalability of the workflow to any major computing architecture (compute server, cluster, grid, cloud) without having to modify the workflow definition
- portability of the workflow by integration with the Conda package manager and Singularity containers.
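The automation point above rests on one idea: a workflow is a set of rules, each producing an output file from input files, and asking for a final target pulls in every step needed to build it. The following is a minimal Python sketch of that idea only. It is not Snakemake's API or implementation, and the rule names (`clean`, `plot`) and file names (`raw.csv`, `clean.csv`, `figure.png`) are hypothetical.

```python
# Toy illustration of file-based rule resolution. NOT Snakemake's API;
# all rule and file names here are hypothetical.
from dataclasses import dataclass


@dataclass
class Rule:
    name: str
    inputs: list
    output: str


RULES = [
    Rule("clean", ["raw.csv"], "clean.csv"),
    Rule("plot", ["clean.csv"], "figure.png"),
]


def plan(target, rules, existing):
    """Return the rule names to run, in order, to produce `target`."""
    if target in existing:
        return []  # the file is already present, nothing to do
    by_output = {r.output: r for r in rules}
    if target not in by_output:
        raise ValueError(f"no rule produces {target!r}")
    rule = by_output[target]
    steps = []
    # Resolve each input first, so upstream rules run before this one.
    for dep in rule.inputs:
        for step in plan(dep, rules, existing):
            if step not in steps:
                steps.append(step)
    steps.append(rule.name)
    return steps


print(plan("figure.png", RULES, existing={"raw.csv"}))
```

Requesting the final figure yields the full chain from raw data, while a target whose inputs already exist needs only its own rule. In this sketch, that is how one workflow definition covers every step without the user listing the steps by hand.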
Nextflow
This part of the session introduces the Nextflow framework, its basic concepts, and how it enables the definition and deployment of large-scale distributed computational pipelines in a portable and reproducible manner across clouds and clusters. In particular, we will discuss:
- installation and introduction to the dataflow processing model
- workflow parallelisation and scalability
- portable workflow containerisation with Docker, Singularity and Shifter
- cloud deployment strategies
Prerequisites
- Linux command line experience