UMI retention with single-end reads bug #13

Open
anoronh4 opened this issue Mar 1, 2023 · 0 comments

anoronh4 commented Mar 1, 2023

Under specific conditions, the UMI sequence is dropped by the STAR aligner. To trigger the bug, the read IDs must carry a read designation after a /, e.g.:

@SRR5665260.1.36873006/1

and the sample must also be single-end and have UMIs. After extraction with umi_tools extract, the UMI is appended after the /1 suffix:

@SRR5665260.1.36873006/1_CAG

but STAR reads the ID only up to the / and discards the rest, so the UMI barcode that follows is lost. If umi_tools extract is run on paired-end reads, the read ID is written differently:

@SRR5665260.1.36873006_CAG

and is therefore properly parsed by STAR.

We could add an extra process to fix the read IDs, but this would take extra storage and is probably better handled with a proper bug fix in umi_tools. This is not urgent, as single-end reads are not a common use case for us and the /1 suffix is similarly rare. We can put it on the back burner for now, but I opened an issue for it on the UMI-tools GitHub.
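
For reference, a minimal sketch of what such a read-ID fix could look like, assuming the goal is to turn the single-end form (`@name/1_UMI`) into the form seen for paired-end reads (`@name_UMI`). The names `fix_read_id` and `fix_fastq_lines` are hypothetical and not part of umi_tools or STAR. The sketch also assumes the UMI contains only A, C, G, T, or N and is joined with `_`.

```python
import re

# Matches a read ID that ends in "/<mate>_<UMI>", e.g. "@SRR5665260.1.36873006/1_CAG".
# Assumption: the UMI is made up of A, C, G, T, N only and "_" is the separator.
_SUFFIX_THEN_UMI = re.compile(r"^(?P<name>\S+?)/(?P<mate>[12])_(?P<umi>[ACGTN]+)$")


def fix_read_id(header: str) -> str:
    """Rewrite '@name/1_UMI [comment]' as '@name_UMI [comment]'.

    Headers that do not match the pattern are returned unchanged.
    """
    read_id, sep, comment = header.partition(" ")
    match = _SUFFIX_THEN_UMI.match(read_id)
    if match is None:
        return header
    return f"{match['name']}_{match['umi']}{sep}{comment}"


def fix_fastq_lines(lines):
    """Yield FASTQ lines with the header (first line of each 4-line record) repaired."""
    for index, line in enumerate(lines):
        if index % 4 == 0:
            yield fix_read_id(line.rstrip("\n")) + "\n"
        else:
            yield line
```

Because `fix_fastq_lines` works line by line, it could in principle sit in a pipe between extraction and alignment instead of writing a second FASTQ, though whether that fits the pipeline is untested here.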
