Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Could iCount xlsites support soft-clipping? #215

Open
TomSmithCGAT opened this issue Mar 11, 2024 · 0 comments
Open

Could iCount xlsites support soft-clipping? #215

TomSmithCGAT opened this issue Mar 11, 2024 · 0 comments

Comments

@TomSmithCGAT
Copy link

When using an aligner which is allowed to soft-clip the reads, this can error out with an index error at poss[idx] as len(read.query_length()) used to define idx will be greater than len(read.get_reference_positions(). Would it be possible for iCount to support use of aligners with soft-clipping?

poss = read.get_reference_positions()
if read.is_reverse:
strand = '-'
xlink_pos = poss[-1] + 1
end_pos = poss[0]
else:
strand = '+'
xlink_pos = poss[0] - 1
xlink_pos = 1 if xlink_pos < 1 else xlink_pos # Case of neg. pos on circular MT
end_pos = poss[-1]
chrom = read.reference_name
second_start, is_strange = _second_start(read, poss, strand, chrom, segmentation, gap_th)
if is_strange:
metrics.strange_recs += 1
# Position of middle nucleotide. Because we can have spliced reads, middle position is
# not necessarily the middle of region the read maps to. Take one nucleotide upstream
# of center in case length is even, happens by default on + strand
idx = read.query_length // 2 - 1 if (strand == '-' and read.query_length % 2 == 0) \
else read.query_length // 2
middle_pos = poss[idx]

As an example, BAM read:

NS500105:608:HHF3NBGXN:4:22601:5900:10521:N:0:GTCCGCrbc:TTAAGAC 147 chr1 189874 0 46S30M = 189693 -211 CACGACATCCTCCTCCCAGTCGCCCCTGTAGCTCCCCTACCTCCAAGAGGGTGTGGGATGGTGGAGGGGTTTGAGA EEEAEA6EAAAAEEEEEEEEEEEAEEEEEEEEEEEEEEEEEEEEE/EEEAEEAEEEEEEEEEAEEEEEEEEAAA/A AS:i:60 XS:i:98 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:30 YS:i:96 YT:Z:CP NH:i:2

poss: [189873, 189874, 189875, 189876, 189877, 189878, 189879, 189880, 189881, 189882, 189883, 189884, 189885, 189886, 189887, 189888, 189889, 189890, 189891, 189892, 189893, 189894, 189895, 189896, 189897, 189898, 189899, 189900, 189901, 189902] (length=30)
idx:37

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant