Metagenomic Intra-Species Diversity Analysis 2 using GPU

noriakis/MIDAS2NV

 
 

Metagenomic Intra-Species Diversity Analysis 2

This version of MIDAS2 is modified to use nvbio or BarraCUDA for metagenotyping, so that the analysis can run on a GPU. Internally it uses BarraCUDA, nvBWT, and nvBowtie. The code currently aligns reads with BarraCUDA because nvBowtie writes an incorrect SAM flag for unmapped reads, so the path to BarraCUDA must be set in an environment variable. Installation instructions for BarraCUDA are here.
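As a rough illustration of what "setting the path in an environment variable" can look like from the Python side, the sketch below resolves the BarraCUDA executable before alignment starts. The variable name `BARRACUDA_PATH`, the executable name `barracuda`, and the helper `find_barracuda` are assumptions made for this example, not names confirmed by this repository; check the code for the variable it actually reads.

```python
import os
import shutil


def find_barracuda(env_var="BARRACUDA_PATH", exe_name="barracuda"):
    """Return a path to the BarraCUDA executable, or None if it is not found.

    Both env_var and exe_name are hypothetical defaults for illustration.
    """
    candidate = os.environ.get(env_var)
    if candidate:
        # The variable may point at the binary itself or at its directory.
        if os.path.isdir(candidate):
            candidate = os.path.join(candidate, exe_name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return os.path.abspath(candidate)
    # Fall back to an ordinary PATH lookup.
    return shutil.which(exe_name)
```

Calling `find_barracuda()` up front and stopping with a clear message when it returns `None` is one way to fail early, rather than partway through a long alignment job.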

Metagenomic Intra-Species Diversity Analysis (MIDAS) is an integrated pipeline for profiling strain-level genomic variation in shotgun metagenomic data. The standard MIDAS workflow harnesses a reference database of 5,926 species extracted from 30,000 genomes (MIDAS DB v1.2). MIDAS2 uses the same analysis workflow as the original MIDAS tool, but is engineered to work with more comprehensive MIDAS Reference Databases (MIDASDBs) and to run on collections of thousands of samples in a fast and scalable manner.

For MIDAS2, we have already built two MIDASDBs from large, public, microbial genome databases: UHGG 1.0 and GTDB r202.

The publication is available in Bioinformatics, and the user manual is available on ReadTheDocs.

The performance of a read-mapping-based metagenotyping pipeline depends on, among other factors, (1) how closely related the database reference genomes are to the strains in the samples being genotyped and (2) the post-alignment filter options. Pitfalls of genotyping microbial communities with rapidly growing genome collections are discussed here.
