A Snakemake workflow for mitochondrial short variant analysis using GATK Best Practices
git clone https://github.com/sysbiocoder/Mito-gatk.git
cd Mito-gatk
git checkout dev
Run mkdir resources and link the reference files.
Run mkdir singularity and link the Singularity containers for GATK, Picard and VEP.
(All files can be found in /medstore/projects/P22-005/Kristina/Mito-gatk.)
cd Mito-gatk
mkdir haplocheck
wget https://github.com/genepi/haplocheck/releases/download/v1.3.2/haplocheck.zip
unzip haplocheck.zip
Download the required VEP cache from the Ensembl FTP site and store it in a directory under resources/vep/cache. Keep the subdirectory structure as is.
The workflow parses two files: 1) samples.tsv, containing sample information, and 2) reads.tsv, containing read and sequencing information. Both files need to have the same columns as the example files below.
The paths to these files are set in config.yaml.
Example samples.tsv:
sample population
E-2843_Liver polg_wt_mtDNA
E-2843_Liver_pcr polg_wt_lrPCR
T-2767_Liver polg_ko_mtDNA
T-2767_Liver_pcr polg_ko_lrPCR
Example reads.tsv:
unit sample reads id PU
210507_TESTSTEST E-2843_Liver data/raw/SRR21601404_1.fastq.gz 1 TESTSTEST
210507_TESTSTEST E-2843_Liver data/raw/SRR21601404_2.fastq.gz 2 TESTSTEST
210507_TESTSTEST E-2843_Liver_pcr data/raw/SRR21601420_1.fastq.gz 1 TESTSTEST
210507_TESTSTEST E-2843_Liver_pcr data/raw/SRR21601420_2.fastq.gz 2 TESTSTEST
210507_TESTSTEST T-2767_Liver data/raw/SRR21601450_1.fastq.gz 1 TESTSTEST
210507_TESTSTEST T-2767_Liver data/raw/SRR21601450_2.fastq.gz 2 TESTSTEST
210507_TESTSTEST T-2767_Liver_pcr data/raw/SRR21601416_1.fastq.gz 1 TESTSTEST
210507_TESTSTEST T-2767_Liver_pcr data/raw/SRR21601416_2.fastq.gz 2 TESTSTEST
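Because the workflow depends on both files having the expected columns, it can help to check them before starting a run. The following is a minimal sketch and not part of the workflow. It assumes the files are tab-separated with exactly the columns shown in the examples above; the function names read_tsv and check_sample_sheets are hypothetical.

```python
import csv
import io

# Column names taken from the example files above.
SAMPLES_COLUMNS = ["sample", "population"]
READS_COLUMNS = ["unit", "sample", "reads", "id", "PU"]


def read_tsv(handle, expected_columns):
    """Read a tab-separated file and check that its header matches expected_columns."""
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames != expected_columns:
        raise ValueError(
            f"expected columns {expected_columns}, found {reader.fieldnames}"
        )
    return list(reader)


def check_sample_sheets(samples_handle, reads_handle):
    """Return the sample names used in reads.tsv that are missing from samples.tsv."""
    samples = {row["sample"] for row in read_tsv(samples_handle, SAMPLES_COLUMNS)}
    reads = read_tsv(reads_handle, READS_COLUMNS)
    return sorted({row["sample"] for row in reads} - samples)


# In-memory demonstration; for real files, pass handles opened with
# open(path, newline="") using the paths set in config.yaml.
demo_samples = io.StringIO(
    "sample\tpopulation\n"
    "E-2843_Liver\tpolg_wt_mtDNA\n"
)
demo_reads = io.StringIO(
    "unit\tsample\treads\tid\tPU\n"
    "210507_TESTSTEST\tE-2843_Liver\tdata/raw/SRR21601404_1.fastq.gz\t1\tTESTSTEST\n"
    "210507_TESTSTEST\tT-2767_Liver\tdata/raw/SRR21601450_1.fastq.gz\t1\tTESTSTEST\n"
)
print("samples missing from samples.tsv:", check_sample_sheets(demo_samples, demo_reads))
```

An empty list means every sample referenced in reads.tsv is also described in samples.tsv. A header that differs from the examples raises a ValueError naming the columns that were found.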
snakemake -s workflow/Snakefile --software-deployment-method conda apptainer
Note (I will fix this): Currently the VEP container needs two --bind arguments set when running the workflow, so annotation can only be run separately after the other parts are done. First uncomment line 46 in Snakefile and line 226 in inputfunctions.smk. Annotation with VEP can then be run with:
snakemake -s workflow/Snakefile --software-deployment-method conda apptainer --apptainer-args "--bind $(pwd)/resources/vep/cache:/opt/vep/.vep --bind $(pwd)/results/variants:/results/variants"
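The two bind mounts in the command above can also be assembled programmatically, which helps avoid quoting mistakes when the project directory changes. This is a minimal sketch that assumes the same host and container paths as the command above. The helper names vep_bind_args and vep_command are hypothetical and not part of the workflow.

```python
import shlex
from pathlib import Path


def vep_bind_args(project_dir):
    """Build the --apptainer-args value with the two bind mounts used for VEP."""
    root = Path(project_dir)
    # (host path, path inside the container), as in the command above.
    binds = [
        (root / "resources/vep/cache", "/opt/vep/.vep"),
        (root / "results/variants", "/results/variants"),
    ]
    return " ".join(f"--bind {host}:{container}" for host, container in binds)


def vep_command(project_dir):
    """Return the snakemake invocation for the VEP step as an argument list."""
    return [
        "snakemake", "-s", "workflow/Snakefile",
        "--software-deployment-method", "conda", "apptainer",
        "--apptainer-args", vep_bind_args(project_dir),
    ]


# Print a shell-quoted command for the current directory (the equivalent of $(pwd)).
print(shlex.join(vep_command(Path.cwd())))
```

The argument list returned by vep_command can be passed directly to subprocess.run, which sidesteps shell quoting of the --apptainer-args value altogether.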