This workshop is split into two parts, in which we work through an example of using Janis to build a genomic variant-calling pipeline.
The workflows in this workshop are adapted from the following GATK WDL pipelines (Broad Institute), with modifications that simplify the tasks for the purposes of this workshop:
- https://github.com/gatk-workflows/gatk4-data-processing
- https://github.com/gatk-workflows/gatk4-germline-snps-indels
The main goal of this workshop is to introduce Janis for building portable pipelines. We will use small test datasets for the demonstrations. Please note that some of the bioinformatics details, such as tool parameters, reference genomes and databases, might not be complete. If you plan to use this pipeline on other samples, please review the pipeline details at the end of this workshop.
- Janis Documentation: https://janis.readthedocs.io/en/latest
- Janis GitHub: https://github.com/PMCC-BioinformaticsCore/janis
- This workshop GitHub: https://github.com/PMCC-BioinformaticsCore/janis-training
In the first session, we will familiarize ourselves with Janis.
Time | Description
---|---
30 minutes | Installing and setting up the Janis environment; running a small workflow as a test
30 minutes | Learning about preconfigured tools; using BWA mem + samtools view; adding Mark Duplicates; running a small test
30 minutes | Adding SortSam + SetNmMdAndUqTags; testing the pipeline
30 minutes | Going through the exercise solutions; Q&A
In the second session, we will complete our portable germline variant-calling pipeline.
Time | Description
---|---
30 minutes | Adding new tool definitions in Janis: creating Janis tool definitions for GATK ApplyBQSR + GATK BaseRecalibrator; adding the new tools to the workflow; testing the updated pipeline
30 minutes | Exercise (adding more tools to complete the germline pipeline): adding GATK HaplotypeCaller; adding the new tool to the workflow; testing the updated pipeline
30 minutes | Wrap-up: going through the exercise solutions; Q&A
- Python 3.6+
- Docker
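
The two requirements listed above can be checked before the workshop with a short script. The following is a minimal sketch, not part of the workshop materials: the function names `python_is_supported` and `docker_on_path` are hypothetical, and the Docker check only confirms that a `docker` executable is on `PATH`, not that the Docker daemon is running.

```python
import shutil
import sys

# Minimum Python version taken from the requirements list (Python 3.6+).
MINIMUM_PYTHON = (3, 6)


def python_is_supported(version_info=None, minimum=MINIMUM_PYTHON):
    """Return True if the (major, minor) version meets the minimum.

    Hypothetical helper; defaults to the running interpreter's version.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


def docker_on_path():
    """Return True if a `docker` executable can be found on PATH.

    This does not verify that the Docker daemon is actually running.
    """
    return shutil.which("docker") is not None


if __name__ == "__main__":
    print("Python 3.6+ :", "ok" if python_is_supported() else "too old")
    print("Docker      :", "found" if docker_on_path() else "not found on PATH")
```

If either check fails, install or upgrade the missing requirement before starting the first session.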