This version is modified to use nvbio or BarraCUDA for metagenotyping with MIDAS2, enabling GPU use in the analysis. Specifically, BarraCUDA, nvBWT, and nvBowtie are used internally. Currently, the code uses BarraCUDA because nvBowtie outputs the wrong SAM flag for unmapped reads. Thus, the path to BarraCUDA should be set in the environment variables. The installation instructions for BarraCUDA are here.
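To illustrate why the SAM flag matters, here is a minimal sketch of how unmapped reads can be identified in SAM output. In the SAM specification, column 2 is a bitwise FLAG and bit `0x4` marks a read as unmapped, so an aligner that sets this bit incorrectly would distort any downstream step that relies on it. The function names and the example records are hypothetical and are not part of MIDAS2, nvBowtie, or BarraCUDA.

```python
# SAM flag bit 0x4 marks a read as unmapped (SAM specification).
UNMAPPED = 0x4


def is_unmapped(sam_line: str) -> bool:
    """Return True if the SAM alignment record has the unmapped bit set."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
    return bool(flag & UNMAPPED)


def count_unmapped(sam_lines) -> int:
    """Count unmapped records, skipping header lines that start with '@'."""
    return sum(
        1 for line in sam_lines
        if line.strip() and not line.startswith("@") and is_unmapped(line)
    )


# Hypothetical records for illustration only (not real aligner output).
example = [
    "@HD\tVN:1.6\tSO:unsorted",
    "read1\t0\tcontig_1\t100\t42\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]
print(count_unmapped(example))  # prints 1
```

A check like this could be run on a small test alignment to compare how each aligner reports unmapped reads.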
Metagenomic Intra-Species Diversity Analysis (MIDAS) is an integrated pipeline for profiling strain-level genomic variation in shotgun metagenomic data. The standard MIDAS workflow harnesses a reference database of 5,926 species extracted from 30,000 genomes (MIDAS DB v1.2). MIDAS2 uses the same analysis workflow as the original MIDAS tool, but is engineered to work with more comprehensive MIDAS Reference Databases (MIDASDBs) and to run on collections of thousands of samples in a fast and scalable manner.
For MIDAS2, we have already built two MIDASDBs from large, public, microbial genome databases: UHGG 1.0 and GTDB r202.
The publication is available in Bioinformatics. The user manual is available at ReadTheDocs.
The performance of a read-mapping-based metagenotyping pipeline depends on, among other factors, (1) how closely related the reference genomes in the database are to the strains in the samples being genotyped, and (2) the post-alignment filter options. Pitfalls of genotyping microbial communities with rapidly growing genome collections can be found here.
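As a rough illustration of how post-alignment filtering changes which reads are kept, the sketch below drops unmapped reads and reads below a mapping-quality (MAPQ) threshold. It is a simplified stand-in, not the MIDAS2 filter: the function name `passes_filter`, the parameter `min_mapq`, the default threshold of 20, and the example records are all assumptions made for illustration.

```python
def passes_filter(sam_line: str, min_mapq: int = 20) -> bool:
    """Keep a SAM record only if it is mapped and has MAPQ >= min_mapq.

    min_mapq and its default are hypothetical, not MIDAS2 settings.
    """
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2: bitwise FLAG
    mapq = int(fields[4])  # column 5: mapping quality
    if flag & 0x4:         # bit 0x4 set means the read is unmapped
        return False
    return mapq >= min_mapq


# Hypothetical records for illustration only.
records = [
    "read1\t0\tcontig_1\t100\t42\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read2\t0\tcontig_1\t250\t10\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]

strict = [r for r in records if passes_filter(r, min_mapq=20)]
lenient = [r for r in records if passes_filter(r, min_mapq=0)]
print(len(strict), len(lenient))  # prints 1 2
```

The same alignments yield different read sets under different thresholds, which is one reason the filter settings affect genotyping results.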