longMetARG

Nextflow pipeline for the identification of antibiotic resistance genes in metagenomic long reads

Introduction

longMetARG is a bioinformatics pipeline that can be used to find antibiotic resistance genes in metagenomic long reads.

The pipeline takes a fasta file of long-read sequences as input (--input). NanoPlot first generates statistics on the reads. The reads are then either mapped directly to the Comprehensive Antibiotic Resistance Database (CARD) with minimap2, or assembled with metaFlye so that the resulting contigs are mapped instead. Aligned reads or contigs are filtered by identity and coverage, keeping only the best non-overlapping hits. The filtered sequences are passed to two further processes: PlasFlow predicts the probable location of each gene (plasmid or chromosome), and the taxonomy process analyses the sequences with either Blast or Diamond to annotate the most probable taxonomic names. The summary process joins the main output features in the summary.tsv file. All output files are stored in the output directory (--outdir).
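The filtering step described above can be illustrated with a minimal Python sketch. It is not the pipeline's own code: the `Hit` record, the `filter_hits` function, the 0.8 thresholds, and the ranking rule (identity first, then coverage) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One alignment of a read/contig against a reference gene (hypothetical record)."""
    query: str        # read or contig name
    q_start: int      # start of the alignment on the query
    q_end: int        # end of the alignment on the query (exclusive)
    target: str       # reference gene name
    target_len: int   # full length of the reference gene
    t_start: int      # start of the alignment on the reference
    t_end: int        # end of the alignment on the reference (exclusive)
    matches: int      # number of matching bases
    aln_len: int      # length of the alignment block

    @property
    def identity(self) -> float:
        return self.matches / self.aln_len

    @property
    def coverage(self) -> float:
        # Fraction of the reference gene covered by the alignment
        return (self.t_end - self.t_start) / self.target_len


def filter_hits(hits, min_identity=0.8, min_coverage=0.8):
    """Drop weak hits, then greedily keep the best non-overlapping hits per query.

    The thresholds and the ranking rule are illustrative assumptions.
    """
    passing = [
        h for h in hits
        if h.identity >= min_identity and h.coverage >= min_coverage
    ]
    # Rank best first: higher identity, then higher coverage
    passing.sort(key=lambda h: (h.identity, h.coverage), reverse=True)

    kept = []
    for h in passing:
        # A hit is discarded if it overlaps a better hit on the same query
        overlaps = any(
            k.query == h.query and h.q_start < k.q_end and k.q_start < h.q_end
            for k in kept
        )
        if not overlaps:
            kept.append(h)
    return sorted(kept, key=lambda h: (h.query, h.q_start))
```

With this greedy approach, two genes aligned to the same region of a read compete and only the higher-ranked one survives, while hits on distinct regions of the same read are all retained.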

The pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner.

Pipeline summary

Get Started

Install Nextflow (>=21.04.0).
Install Conda or Miniconda (Docker will be enabled soon).
Install the CARD database following the instructions on https://card.mcmaster.ca/. Either place card_database_v*.fasta in a folder named card_db inside the directory containing the longMetARG main.nf, or indicate the location of the folder and the database file name using the option --argDb.
Store the file aro_index.tsv from the [CARD ontology](https://card.mcmaster.ca/download) in the /bin folder of the working directory or indicate its location with the option --drugClass.
Install the reference database for Blast (default: ref_prok_rep_genomes) or Diamond (default: nr.dmnd) in the working directory, naming the folder blast_db or diamond_db, respectively. If the reference database already exists on your system, indicate its location and name using --blastDbDir and --blastDbName, or --diamondDbDir and --diamondDbName.
Download the pipeline and test it on a minimal dataset (test.fasta) by running the following command inside the folder containing the installed files: ./nextflow run main.nf --input test.fasta -profile conda
For more information on command-line options, run ./nextflow run main.nf --help.
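Before a first run, it can help to confirm that the default file locations described in the steps above are in place. The following Python sketch is a hypothetical helper, not part of longMetARG. It assumes that the working directory is the folder containing main.nf, and the function name `missing_default_inputs` is made up for illustration.

```python
from pathlib import Path


def missing_default_inputs(workdir, taxonomy_tool="blast"):
    """Return a list of default inputs that are absent from workdir.

    Assumes workdir is the folder containing main.nf (hypothetical helper).
    """
    root = Path(workdir)
    missing = []

    # CARD sequences: card_db/card_database_v*.fasta
    card_dir = root / "card_db"
    if not (card_dir.is_dir() and any(card_dir.glob("card_database_v*.fasta"))):
        missing.append("card_db/card_database_v*.fasta (or pass --argDb)")

    # CARD ontology index: bin/aro_index.tsv
    if not (root / "bin" / "aro_index.tsv").is_file():
        missing.append("bin/aro_index.tsv (or pass --drugClass)")

    # Reference database folder for the chosen taxonomy tool
    if taxonomy_tool == "blast":
        db_dir, flags = "blast_db", "--blastDbDir/--blastDbName"
    else:
        db_dir, flags = "diamond_db", "--diamondDbDir/--diamondDbName"
    if not (root / db_dir).is_dir():
        missing.append(f"{db_dir}/ (or pass {flags})")

    return missing


if __name__ == "__main__":
    for item in missing_default_inputs("."):
        print("missing:", item)
```

An empty result only means the default paths exist. It does not verify that the databases themselves are complete or correctly formatted.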
