to do: write some documentation...
Quick how-to:
- Use conda.
- You need a folder named results/ that holds the Genome Detective output files. Filenames follow the pattern [run_id]_[sample_id]_[extension], e.g. "3_1_results.csv". The extensions are "results.csv" for the 'assigned' contigs, "discovery.csv" for the 'discovered' contigs and "results.xml" for the XML files.
- You also need a folder named tmp/ to store intermediate files.
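The folder and filename requirements above can be checked before starting the workflow. The following is a minimal sketch, not part of the pipeline itself: the helper names `parse_result_filename` and `check_layout` are hypothetical, and it assumes that run IDs and sample IDs never contain an underscore.

```python
import re
from pathlib import Path

# The three extensions named in the how-to above.
KNOWN_EXTENSIONS = ("results.csv", "discovery.csv", "results.xml")

# Assumption: run IDs and sample IDs contain no underscores.
FILENAME_PATTERN = re.compile(
    r"^(?P<run_id>[^_]+)_(?P<sample_id>[^_]+)_(?P<extension>.+)$"
)


def parse_result_filename(filename):
    """Return (run_id, sample_id, extension), or None if the name does not fit."""
    match = FILENAME_PATTERN.match(filename)
    if match is None or match.group("extension") not in KNOWN_EXTENSIONS:
        return None
    return match.group("run_id"), match.group("sample_id"), match.group("extension")


def check_layout(base_dir="."):
    """Create tmp/ if needed and list files in results/ that do not fit the pattern."""
    base = Path(base_dir)
    (base / "tmp").mkdir(exist_ok=True)
    results = base / "results"
    if not results.is_dir():
        raise FileNotFoundError(f"missing folder: {results}")
    return sorted(
        path.name
        for path in results.iterdir()
        if path.is_file() and parse_result_filename(path.name) is None
    )
```

Calling `check_layout()` from the repository root returns the names of any files in results/ that do not match the expected pattern; an empty list means every file fits.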
Then create and activate the conda environment and run the workflow:
conda env create -n genomedetective -f envs/GD_parsed.yaml
source activate genomedetective
snakemake -p
Or open a Jupyter Notebook and play with the .ipynb files in the bin/ directory.
Output files:
- A CSV file derived from the XML (for easier reading and use in spreadsheets). This file contains:
  - the run ID (number)
  - the sample ID (number/code)
  - the total number of reads for that sample
  - the number of "low quality reads", i.e. those discarded by Trimmomatic
  - the number of "non viral reads", i.e. those not recognised by DIAMOND as viral (against the UniRef90 + HIV from UniRef50 database)
  - the number of viral reads (by DIAMOND)
  - total runtime in seconds
- A report table (as CSV) that combines results from the CSV and XML output files. It summarises in 23 columns all information that may be useful when comparing the assignments by Genome Detective with the (validated) qPCR results. (Note that some fields still have to be filled in manually!)
- Heatmaps of the reported viruses ('virome comparison tool').
- A table describing the results in the CAMI profiling format.
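The per-sample read counts in the XML-derived CSV described above are easier to compare across samples when expressed as fractions of the total. Below is a minimal sketch of that calculation. The column names (`run_id`, `sample_id`, `total_reads`, and so on) are assumptions, not the actual header of the file, and the example row contains invented numbers for illustration only.

```python
import csv
import io

# Hypothetical column names; check the header of the real file before use.
READ_COLUMNS = ("low_quality_reads", "non_viral_reads", "viral_reads")


def read_fractions(row):
    """Express each read category as a fraction of the sample's total reads."""
    total = int(row["total_reads"])
    if total == 0:
        return {column: 0.0 for column in READ_COLUMNS}
    return {column: int(row[column]) / total for column in READ_COLUMNS}


def summarise(csv_text):
    """Map (run_id, sample_id) to the read fractions for that sample."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {(row["run_id"], row["sample_id"]): read_fractions(row) for row in reader}


# Invented numbers, for illustration only.
example = (
    "run_id,sample_id,total_reads,low_quality_reads,"
    "non_viral_reads,viral_reads,runtime_seconds\n"
    "3,1,1000,100,800,100,42\n"
)
print(summarise(example))
```

To use this on a real file, read its contents with `open(path).read()` and adjust `READ_COLUMNS` and the key names to match the header that the workflow actually writes.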