Home

kakapo

Studies in many fields within the life sciences increasingly rely on RNA sequencing (RNA-seq) data. As a result, RNA-seq datasets deposited in the NCBI Sequence Read Archive (SRA) are proliferating. In addition to serving as an archive for the original studies, these datasets present an opportunity for novel research.

kakapo is a pipeline that allows users to extract and assemble a specified gene or protein family from any number of SRA accessions (or from their own RNA-seq data). kakapo identifies open reading frames in the assembled transcripts and annotates them using InterProScan. Additionally, raw reads can be filtered to remove ribosomal, plastid, and mitochondrial reads, as well as reads belonging to non-target organisms (viral, bacterial, etc.).
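To make the idea of identifying open reading frames concrete, here is a minimal sketch of a naive six-frame ORF scan. It is an illustration only and should not be read as kakapo's own method: the function names `find_orfs` and `reverse_complement`, the `min_codons` parameter, and the simple "first ATG to next in-frame stop" rule are all assumptions made for this example.

```python
# Illustrative only: a naive ORF finder, not kakapo's actual implementation.
# All names and the ATG-to-stop rule are assumptions made for this sketch.
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Yield (strand, start, end, orf) for ATG...stop ORFs in all six frames.

    start and end are 0-based, end-exclusive offsets on the scanned strand.
    min_codons counts the start codon but not the stop codon.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # offset of the first ATG seen since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        yield strand, start, i + 3, s[start:i + 3]
                    start = None


if __name__ == "__main__":
    for orf in find_orfs("CCATGAAATTTGGGTAACC"):
        print(orf)  # ('+', 2, 17, 'ATGAAATTTGGGTAA')
```

A real ORF finder would typically also handle alternative genetic codes, ambiguous bases, and ORFs that run off the end of a partial transcript; this sketch deliberately ignores those cases.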

kakapo can be flexibly employed to extract arbitrary loci, such as those commonly used for phylogenetic inference in systematics.

A brief overview of kakapo from the Botany 2020 conference

Installation and Usage

Workflow Overview

Process RNA-seq reads
Filter RNA-seq reads
    1. Bowtie 2
    2. Kraken 2
Filtered Read Set: Alternative Stopping Point
Produce BLAST databases for filtered RNA-seq reads
Process query sequences
Search for RNA-seq reads
Targeted transcript assembly
Transcript annotation
