Generate contaminants file #28

pinin4fjords · 2024-02-20T11:54:27Z

Description of feature

sortmerna is implemented in the pipeline and runs by default. There will also be a bunch of other short RNA species we should remove, for which we can use the (also inherited) bbsplit functionality.

But we do need to derive a list of contaminant sequences and figure out where to store it.

JackCurragh · 2024-04-17T09:30:32Z

So is it just rRNA that is removed by default? I am not clear on what the combination of bbsplit and sortmerna achieves, so it is hard to know what kinds of contaminants you have in mind (tRNA, phiX?).

pinin4fjords · 2024-04-17T10:12:43Z

I came to the conclusion that a blanket cross-species set was not practical.

For test_full I used the usual rRNA complement with human tRNA sequences added (https://github.com/nf-core/test-datasets/blob/riboseq/testdata/rrna-db-full.txt), but I think this will be down to the user, so maybe this is a documentation issue.
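As a rough illustration of what "rRNA complement plus user-chosen extras" could look like in practice, here is a minimal Python sketch that merges two lists of FASTA locations into one list file. It assumes the linked rrna-db-full.txt is a plain-text list with one FASTA path or URL per line, which is not verified here. The function name `build_manifest` and all file names are hypothetical, not part of the pipeline.

```python
def build_manifest(base_entries, extra_entries):
    """Return list-file text: one FASTA path or URL per line.

    Input order is kept; blank entries and duplicates are dropped.
    The one-entry-per-line layout is an assumption, not a confirmed format.
    """
    seen = set()
    lines = []
    for entry in list(base_entries) + list(extra_entries):
        entry = entry.strip()
        if not entry or entry in seen:
            continue
        seen.add(entry)
        lines.append(entry)
    return "\n".join(lines) + "\n"


# Hypothetical file names, for illustration only.
rrna_fastas = ["rrna/silva-euk-18s.fasta", "rrna/silva-euk-28s.fasta"]
user_fastas = ["contaminants/human-trna.fasta", "rrna/silva-euk-18s.fasta"]

manifest_text = build_manifest(rrna_fastas, user_fastas)
print(manifest_text, end="")
```

The species-specific part is simply whatever the user puts in the second list, which fits the view that this is mostly a documentation question rather than something the pipeline can ship as a single cross-species file.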

pinin4fjords added the enhancement New feature or request label Feb 20, 2024

pinin4fjords added this to the v1.1.0 milestone Feb 20, 2024
