TODOs
-
Create an example genomes.fasta from public sequences
-
Deal with alts: produce all combinations of alts, each at a relatively lower proportion (1/2^n)
-
Tune bowtie2 so that amplicon dropout occurs in line with experiments
-
Tune bowtie2 to allow ACGT -> N substitutions in primer sites; at the moment only single N insertions are allowed.
-
What to do with genome ends (amplicons 1 and 98)? At the moment they are dropped fairly consistently, because the leftmost and rightmost primer sites don't exist on most of the genomes we look at, except for the Wuhan reference.
-
Simulate other PCR products (how? Chimeras -> Simera; point mutations -> with a script?). Ignore chimeras for now.
-
Test the 'exact' amplicon distribution method (Nicola's method) to make sure it works as intended; it is already implemented but needs testing. On the same point, Nicola asked for multinomial sampling from each set of reads and for the ability to read from a SAM or BAM file. This is currently not supported.
-
The cruddiness parameter should be allowed to differ for each genome; a natural place for that information is the TSV file containing the genome abundances. If a command-line value is given, use it for all genomes; if not, look in the genome abundances file; otherwise use a default.
-
For the loop over reads, keep track of each pass through the outer loop (over genomes).
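
As a starting point for the multinomial sampling TODO above, here is a minimal sketch in Python. It shows one possible reading of the request: given relative weights per amplicon, draw a fixed total number of reads so that the per-amplicon counts follow a multinomial distribution. The function name `multinomial_counts`, the amplicon labels and the weights are all hypothetical and are not part of the current code.

```python
import random
from collections import Counter


def multinomial_counts(weights, n_reads, seed=None):
    """Draw n_reads items; return how many land on each key of `weights`.

    `weights` maps a label (e.g. an amplicon name, hypothetical here) to a
    non-negative relative weight. The counts always sum to n_reads.
    """
    rng = random.Random(seed)  # local RNG so a seed gives repeatable draws
    names = list(weights)
    draws = rng.choices(names, weights=[weights[n] for n in names], k=n_reads)
    counts = Counter(draws)
    # Report every label, including those that received no reads.
    return {name: counts.get(name, 0) for name in names}


if __name__ == "__main__":
    # Hypothetical weights: amplicon_2 has dropped out entirely.
    example = {"amplicon_1": 0.5, "amplicon_2": 0.0, "amplicon_3": 0.5}
    print(multinomial_counts(example, n_reads=1000, seed=1))
```

Using the standard library keeps the sketch dependency-free; `numpy.random.Generator.multinomial` would be the obvious alternative if NumPy is already a dependency.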
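
The lookup order described in the cruddiness TODO above (command line first, then the genome abundances TSV, then a default) could be sketched as follows. This is an illustration under assumptions, not the existing implementation: the column names `genome`, `abundance` and `cruddiness`, the function names, and the default value of `0.0` are all hypothetical.

```python
import csv
import io
from typing import Dict, Optional

DEFAULT_CRUDDINESS = 0.0  # hypothetical; the real default is not stated here


def resolve_cruddiness(cli_value: Optional[float],
                       tsv_value: Optional[str],
                       default: float = DEFAULT_CRUDDINESS) -> float:
    """Pick the cruddiness for one genome: command line, then TSV, then default."""
    if cli_value is not None:
        return cli_value
    if tsv_value is not None and tsv_value.strip() != "":
        return float(tsv_value)
    return default


def load_cruddiness(tsv_text: str,
                    cli_value: Optional[float] = None,
                    default: float = DEFAULT_CRUDDINESS) -> Dict[str, float]:
    """Map each genome in the abundances TSV to its resolved cruddiness.

    Assumes a header row with a 'genome' column and an optional
    'cruddiness' column (both names are assumptions).
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {
        row["genome"]: resolve_cruddiness(cli_value, row.get("cruddiness"), default)
        for row in reader
    }


if __name__ == "__main__":
    tsv = "genome\tabundance\tcruddiness\nA\t0.7\t0.2\nB\t0.3\t\n"
    print(load_cruddiness(tsv))                 # per-genome values, default for B
    print(load_cruddiness(tsv, cli_value=0.9))  # command-line value wins for all
```

An empty or missing `cruddiness` cell falls through to the default, so existing abundance files without that column would keep working.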