This session is full.
We will start the session with a quick refresher on the basics of bash. I will then introduce a few well-known Unix tools and shell features, focusing on how to use them to make key bioinformatics tasks easier and more efficient.
Getting the most out of your shell (bash centric)
As bioinformaticians, we regularly deal with directories filled with hundreds of files and have to manage an equally large number of parallel jobs. Many features of the shell can make this easier. Here I will focus on the key ones that I use often.
Tools: bash (loops, functions, strings), xargs, parallel
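As a taste of what this part covers, here is a minimal sketch combining a loop, a function, string manipulation and xargs. The file names, the `sample_name` helper and the choice of `gzip` as the per-file job are all assumptions made for illustration; they are not taken from the session material.

```shell
set -eu

# Hypothetical working directory with four empty placeholder files.
workdir=$(mktemp -d)
for i in 1 2 3 4; do
    touch "$workdir/sample_$i.fastq"
done

# Function + string manipulation: turn a path into a sample name
# using parameter expansion instead of calling basename/sed.
sample_name() {
    local path="$1"
    local file="${path##*/}"    # drop everything up to the last slash
    echo "${file%.fastq}"       # drop the trailing .fastq
}

# Loop: the glob expands to one word per file, so names with spaces are safe.
for f in "$workdir"/*.fastq; do
    echo "found $(sample_name "$f")"
done

# xargs: run one gzip per file (-n 1), at most two at a time (-P 2).
# -print0 / -0 keep unusual file names intact.
find "$workdir" -name '*.fastq' -print0 | xargs -0 -n 1 -P 2 gzip
```

If GNU parallel is installed, the last step could instead be written as `parallel gzip ::: "$workdir"/*.fastq`, which by default runs one job per CPU core.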
Manipulating tabular data
Much bioinformatics data is tabular: GFF, VCF and SAM, for example. Using these formats as examples, I will introduce some useful tools for manipulating tabular data.
Tools: cut, paste, awk, shuf, comm
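To give a feel for these tools, the sketch below runs each of them on a tiny made-up GFF-style file. The records, file names and variable names are invented for illustration only.

```shell
set -eu

workdir=$(mktemp -d)
gff="$workdir/toy.gff"

# Three toy records with the nine tab-separated GFF columns:
# seqid, source, type, start, end, score, strand, phase, attributes.
printf '%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\t%s\n' \
    chr1 toy gene 100 500 . + . ID=g1 \
    chr1 toy exon 100 200 . + . ID=e1 \
    chr2 toy gene 300 900 . - . ID=g2 > "$gff"

# awk: keep only gene records and append the feature length.
gene_lengths=$(awk 'BEGIN { FS = OFS = "\t" }
    $3 == "gene" { print $1, $4, $5, $5 - $4 + 1 }' "$gff")
echo "$gene_lengths"

# cut + paste: pull out two columns and glue them back in a new order.
cut -f1 "$gff" > "$workdir/seqid.txt"
cut -f3 "$gff" > "$workdir/type.txt"
paste "$workdir/type.txt" "$workdir/seqid.txt" > "$workdir/swapped.tsv"

# comm: sequence ids present in both lists (inputs must be sorted).
sort -u "$workdir/seqid.txt" > "$workdir/ids_a.txt"
printf 'chr2\nchr3\n' > "$workdir/ids_b.txt"
shared=$(comm -12 "$workdir/ids_a.txt" "$workdir/ids_b.txt")
echo "shared: $shared"

# shuf: two randomly chosen records. shuf is part of GNU coreutils,
# so it is guarded in case it is not installed.
if command -v shuf >/dev/null 2>&1; then
    shuf -n 2 "$gff"
fi
```

The same commands work unchanged on real GFF, VCF or SAM files, provided header lines (those starting with `#` or `@`) are filtered out first.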
Manipulating sequence data
Manipulating sequence data such as FASTA and FASTQ requires specialised bioinformatics tools. Two very useful ones are samtools and bioawk. This section will show you how to easily accomplish common tasks like splitting, sampling or reformatting a large sequence file.
Tools: samtools, bioawk
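As a rough sketch of the kind of task covered here, the snippet below builds a two-record toy FASTA file and, if the tools are installed, indexes it with samtools and summarises it with bioawk. The file and its sequences are made up, and each tool is guarded so the snippet still runs where it is missing.

```shell
set -eu

workdir=$(mktemp -d)
fasta="$workdir/toy.fa"

# Hypothetical two-record FASTA file.
printf '>seq1\nACGTACGT\n>seq2\nGGCC\n' > "$fasta"

if command -v samtools >/dev/null 2>&1; then
    samtools faidx "$fasta"             # build the index (toy.fa.fai)
    samtools faidx "$fasta" seq1:1-4    # extract bases 1-4 of seq1
fi

if command -v bioawk >/dev/null 2>&1; then
    # -c fastx parses FASTA/FASTQ and exposes $name and $seq.
    bioawk -c fastx '{ print $name, length($seq) }' "$fasta"
    # Reformat so that every sequence sits on a single line.
    bioawk -c fastx '{ print ">" $name; print $seq }' "$fasta"
fi
```

Because `samtools faidx` reads only the requested region once the index exists, the same extraction pattern scales to very large reference files.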
Prerequisites
- A laptop with a modern web browser