Annotation of Ostrea lurida transcriptome: analysis of gonads in control and ocean acidification conditions

For my project, I analyzed four sets of Olympia oyster (Ostrea lurida) gonad nucleotide sequences with accompanying quality scores (.fastq files). These sequences are from male and female specimens exposed to either control or ocean acidification conditions. I also had a previously assembled O. lurida transcriptome for comparison.

My goal was to annotate the O. lurida transcriptome using the gonad nucleotide information and to understand how ocean acidification conditions affect gene expression in O. lurida gonads.

Objectives

  1. Identify differential expression between samples exposed to control and ocean acidification conditions
  2. Characterize gene ontology information associated with differential gene expression
  3. Understand basic bioinformatics analysis techniques (FastQC, MultiQC, BLAST, etc.)
  4. Create reproducible protocols for analysis
  5. Produce a written report of findings

Project Timeline

Week 3:

Week 4:

Week 5-6:

Week 7:

Week 8:

Week 9:

Week 10:

Directory Structure

For more information about each subdirectory, see that subdirectory's README.md file.

data/: Data used for project analyses

Some files are listed in .gitignore, but the directory on the local machine includes the following files:

O. lurida .fastq files

O. lurida transcriptome

kallisto quant count data (.txt files)

kallisto quant count data from 11-29-2016 (.txt files)
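As an illustration of how per-sample count files like these can be combined before differential expression analysis, here is a minimal Python sketch. It assumes the tab-delimited layout that kallisto quant writes to abundance.tsv (target_id, length, eff_length, est_counts, tpm); the .txt files in this directory may be organized differently. The sample names, contig names and counts are made up for the example, and this is not the script used in the project.

```python
import csv
import io


def read_est_counts(handle):
    """Return {target_id: est_counts} from a kallisto-style abundance table."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["target_id"]: float(row["est_counts"]) for row in reader}


def merge_counts(samples):
    """Combine per-sample count dicts into {target_id: {sample: count}}.

    Targets missing from a sample are filled with 0.0.
    """
    targets = sorted({t for counts in samples.values() for t in counts})
    return {
        t: {name: counts.get(t, 0.0) for name, counts in samples.items()}
        for t in targets
    }


# Hypothetical in-memory tables standing in for two real count files.
control = io.StringIO(
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "contig_1\t500\t350\t12\t3.4\n"
    "contig_2\t800\t650\t0\t0\n"
)
oa = io.StringIO(
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "contig_1\t500\t350\t30\t8.1\n"
    "contig_3\t300\t150\t5\t2.0\n"
)

merged = merge_counts({
    "control": read_est_counts(control),
    "oa": read_est_counts(oa),
})
for target, counts in merged.items():
    print(target, counts)
```

With real files, each `io.StringIO` object would be replaced by an open file handle. Note that DESeq2 expects integer counts, so estimated counts are usually rounded before import.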

analyses/: Output for multiple analyses

oly_oa_gonad_FASTQC: O. lurida analysis reports from FastQC interactive application

oly_oa_gonad_FASTQC_commandline: O. lurida analysis reports generated using FastQC in the command line

oly_oa_gonad_multiqc: FastQC analyses compiled into one report using MultiQC

oly_oa_gonad_blastx: Best matches between the O. lurida transcriptome and the UniProt database, with gene ontology information

irrelevant analyses: Folder with analyses not used to generate final results. Includes DESeq2 output from pairwise comparisons and DESeq2 output from 2016-11-16.

11-29-oly-oa-gonad-DESeq2: Revised R scripts, graphs and .tab files associated with DESeq2 analysis

oly_oa_gonad_GO_enrichment: Files, scripts and images associated with gene enrichment analysis in DAVID

oly_oa_gonad_REVIGO: Scatterplots, tree maps, R scripts and term tables generated using REVIGO
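The "best matches" in oly_oa_gonad_blastx are the top-scoring database hit for each transcriptome contig. The Python sketch below shows one way to pick such hits. It assumes BLAST's standard 12-column tabular output (-outfmt 6) and uses bitscore as the ranking criterion; the actual output format and criterion used in this project may differ. The contig names and subject IDs are invented for the example.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
BLAST_COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hits(handle):
    """Return {query_id: row}, keeping the highest-bitscore hit per query."""
    best = {}
    for fields in csv.reader(handle, delimiter="\t"):
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        row = dict(zip(BLAST_COLUMNS, fields))
        score = float(row["bitscore"])
        current = best.get(row["qseqid"])
        if current is None or score > float(current["bitscore"]):
            best[row["qseqid"]] = row
    return best


# Hypothetical hits: two for contig_1, one for contig_2.
example = io.StringIO(
    "contig_1\tsp|X00001|FAKE1_EXMPL\t85.0\t120\t18\t0\t1\t360\t5\t124\t1e-50\t180\n"
    "contig_1\tsp|X00002|FAKE2_EXMPL\t60.0\t100\t40\t0\t1\t300\t9\t108\t1e-20\t95\n"
    "contig_2\tsp|X00003|FAKE3_EXMPL\t45.0\t80\t44\t0\t10\t249\t3\t82\t1e-08\t60\n"
)

hits = best_hits(example)
for query, row in hits.items():
    print(query, row["sseqid"], row["evalue"], row["bitscore"])
```

The accession in the second field of each subject ID (between the pipe characters) is what would be used to look up gene ontology terms for the matched contig.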

results/: Overall project results

blastx: Contigs in the O. lurida transcriptome matched to the UniProt database, with gene ontology information

Differentially expressed genes from DESeq2: .tab file with genes and .png DESeq2 plot

Gene enrichment analyses from DAVID: .tab file of differentially expressed genes with associated sequences and UniProt information, .txt files with UniProt accession codes, and several Functional Annotation tables from DAVID

Gene ontology analysis in REVIGO: Scatterplots, tree maps and tables for biological processes, cellular components and molecular function GO terms

scripts/: Subdirectory with scripts used in project

tutorials/: Step-by-step tutorials for different programs

BLAST tutorial + accompanying data files

DESeq2 tutorial + accompanying data files

CoGe tutorial + accompanying files

Bash scripting tutorial + accompanying files

notebooks/: Jupyter notebooks that detail reproducible methods used for data analysis