QC pipeline with script only #103
Comments
The QC pipeline just runs FASTQC and samtools stats/plot-bamstats. You could run these commands, but they are nothing special, just wrappers around those programs:
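As a rough illustration of running those tools directly (this is not the clockwork wrapper itself), here is a minimal Python sketch that builds the three commands. It assumes `fastqc`, `samtools` and `plot-bamstats` are on your PATH and that the output directory already exists. The function name `qc_commands` and all file names are hypothetical.

```python
import shlex


def qc_commands(reads1, reads2, bam, outdir):
    """Build FastQC and samtools QC commands as shell strings.

    Hypothetical helper: it only builds the command lines, it does not
    run anything. The output directory is assumed to exist already.
    """
    out = outdir.rstrip("/")
    stats_file = f"{out}/samtools_stats.txt"
    plot_prefix = f"{out}/plots/"  # plot-bamstats writes its files under this prefix
    q = shlex.quote
    return [
        # FastQC report for each reads file
        f"fastqc --outdir {q(out)} {q(reads1)} {q(reads2)}",
        # samtools stats prints to stdout, so redirect it to a file
        f"samtools stats {q(bam)} > {q(stats_file)}",
        # plots generated from the samtools stats output
        f"plot-bamstats -p {q(plot_prefix)} {q(stats_file)}",
    ]


if __name__ == "__main__":
    for cmd in qc_commands("reads_1.fastq.gz", "reads_2.fastq.gz", "sample.bam", "qc_out"):
        print(cmd)
```

Printing the commands first makes it easy to check them before running them in a shell or passing them to `subprocess.run(cmd, shell=True, check=True)`.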
Thank you, @martinghunt; it works flawlessly. Now I can switch entirely to the new version. Sincerely,
How you decide a sample is bad and remove it is up to you :) There's no set method, and it depends on the analysis you're doing. You could remove samples up front, e.g. if (making up example numbers) <90% of the genome has coverage >20X, or if a low % of reads map, or if the reads are low quality (e.g. the error rate from samtools). You could also remove samples after variant calling, e.g. for TB if a sample has >10k variants or a lot of "heterozygous" calls (both of those suggest contamination).
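The kinds of checks described above could be sketched in Python as follows. This is only an illustration under stated assumptions: the summary numbers are read from the `SN` lines of `samtools stats` output, the percentage of the genome with coverage >20X is assumed to be computed elsewhere, and every threshold is an example value (90% and 10k echo the made-up numbers above; the mapped-reads and error-rate cut-offs are invented here). The function names are hypothetical.

```python
def parse_samtools_sn(stats_text):
    """Read the summary-number (SN) lines of `samtools stats` output into a dict."""
    summary = {}
    for line in stats_text.splitlines():
        if not line.startswith("SN\t"):
            continue
        fields = line.split("\t")
        key = fields[1].rstrip(":")
        try:
            summary[key] = float(fields[2])
        except (IndexError, ValueError):
            continue  # skip anything that is not "SN<TAB>key:<TAB>number"
    return summary


def qc_failures(summary, pct_genome_covered, n_variants=None,
                min_pct_covered=90.0, min_pct_mapped=80.0,
                max_error_rate=0.02, max_variants=10_000):
    """Return a list of reasons a sample fails QC (empty list means it passes).

    All thresholds are illustrative defaults, not recommendations.
    pct_genome_covered is the % of the genome with coverage >20X,
    assumed to be calculated elsewhere.
    """
    failures = []
    if pct_genome_covered < min_pct_covered:
        failures.append("too little of the genome is well covered")
    total = summary.get("raw total sequences", 0.0)
    pct_mapped = 100.0 * summary.get("reads mapped", 0.0) / total if total else 0.0
    if pct_mapped < min_pct_mapped:
        failures.append("low percentage of reads mapped")
    if summary.get("error rate", 0.0) > max_error_rate:
        failures.append("high error rate")
    # Optional check after variant calling
    if n_variants is not None and n_variants > max_variants:
        failures.append("too many variants (possible contamination)")
    return failures
```

For example, `qc_failures(parse_samtools_sn(open("samtools_stats.txt").read()), pct_genome_covered=95.0)` returns an empty list when the sample passes every check. A check on the proportion of "heterozygous" calls could be added in the same way.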
Thanks very much, @martinghunt. These recommendations are helpful. I used clockwork when it was part of the sp3 platform developed by Oxford University, but that platform is now being shut down, so I am following their workflows step by step, and there are some things I cannot handle. Sincerely,
Hello,
I want to update to the new version, but I cannot see how to run QC when using only the scripts, without the tracking database. I want to run QC as in the older version (FastQC and samtools QC). Please give me some guidance so I can QC my data before analysis.
Sincerely,
Trung