grp-bork/reCOGnise

reCOGnise workflow

Developed by the Bork Group
Raise an issue or contact us

See our other Software & Services
The development of this workflow was supported by NFDI4Microbiota.

Description

reCOGnise is a pipeline that assigns microbial genomes to species (specI clusters) using COG marker genes. It is a port of a workflow that was used, for example, for species assignment in the proGenomes database.


Citation

This workflow: DOI

Also cite:

Fullam A, Letunic I, Schmidt TSB, et al. proGenomes3: approaching one million accurately and consistently annotated high-quality prokaryotic genomes. Nucleic Acids Res. 2023;51(D1):D760-D766. doi:10.1093/nar/gkac1078
Coelho LP, Alves R, Del Río ÁR, et al. Towards the biogeography of prokaryotic genes. Nature. 2022;601(7892):252-256. doi:10.1038/s41586-021-04233-4

Overview

reCOGnise Workflow Diagram


Requirements

reCOGnise requires a Docker or Singularity installation. All dependencies are contained in the reCOGnise Docker container.

The dependencies are:

  • prodigal
  • fetchMGS.pl
  • MAPseq

Usage

Cloud-based Workflow Manager (CloWM)

This workflow will be available on the CloWM platform (coming soon).

Command-Line Interface (CLI)

You can either clone this repository from GitHub and run it as follows:

git clone https://github.com/grp-bork/reCOGnise.git
nextflow run /path/to/reCOGnise --input_dir /path/to/genome/fastas --output_dir /path/to/output_dir

Input genome FASTA files must have one of the following file endings: {fna,fasta,fa,fna.gz,fasta.gz,fa.gz}. Alternatively, you can set the pattern with params.file_pattern = "**.{<comma-separated-list-of-file-endings>}".
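Before launching a run, it can help to check which files in the input directory would match the accepted endings. The following is a minimal sketch, not part of reCOGnise: the function names `list_genome_fastas` and `count_genome_fastas` are hypothetical helpers, and the sketch assumes the default pattern searches the input directory recursively (as the leading `**` suggests).

```shell
#!/bin/sh
# Hypothetical helper, not shipped with reCOGnise.
# list_genome_fastas DIR: print every file under DIR (recursively) whose
# name ends in one of the default endings: fna, fasta, fa, optionally .gz.
list_genome_fastas() {
    find "$1" -type f \( \
        -name '*.fna' -o -name '*.fasta' -o -name '*.fa' -o \
        -name '*.fna.gz' -o -name '*.fasta.gz' -o -name '*.fa.gz' \
    \) | sort
}

# count_genome_fastas DIR: print how many matching files were found.
count_genome_fastas() {
    list_genome_fastas "$1" | wc -l | tr -d ' '
}
```

For example, `count_genome_fastas /path/to/genome/fastas` prints the number of candidate genome files; a count of 0 suggests the endings need renaming or a custom file pattern.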

Alternatively, you can have Nextflow pull the workflow from GitHub and run it from the $HOME/.nextflow directory:

nextflow run grp-bork/reCOGnise --input_dir /path/to/genome_files --output_dir /path/to/output_dir
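
With either invocation, the file pattern described above can presumably also be supplied on the command line, since Nextflow maps a double-dash option `--name` to `params.name`. A minimal sketch, in which the pattern restricting input to `.fna` and `.fna.gz` files is a hypothetical example:

```shell
# Single quotes keep the shell from expanding the braces and asterisks
# before Nextflow receives the pattern. The pattern value is illustrative.
nextflow run grp-bork/reCOGnise \
    --input_dir /path/to/genome_files \
    --output_dir /path/to/output_dir \
    --file_pattern '**.{fna,fna.gz}'
```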