A case of a missing VFR_ED_TR in precision.vcf #38

Open
bricoletc opened this issue May 24, 2022 · 1 comment

@bricoletc
Member

Hey Marto,

I have the following case from a cortex-evaluated VCF (organism: Pf, gene: EBA175). The VCF fields are broken into lines on purpose:

Pf3D7_07_v3
1360357
UNION_BC_k31_var_11058
GCACTGAAATAGCACACAGAACGGAAACTCGTACGGATGAACGAAAAAATCAGGAACCAGCAAATAAGGATTTAAAGAATCCACAACAAAGTGTAGGAGAGAACGGAACTAAAGATTTATTACAAGAAGATTTAGGAGGATCACGAAGTGAAGACGAAGTGACACAAGAATTTGGAGTAAATCATGGAATACCTAAGGGTGAGGATCAAACGTTAGGAAAATCTGACGCCATTCCAAACATAGGCGAACCCGAAACGGGAATTTCCACTACAGAAGAAAGTAGACATGAAGAAGGCCACAATAAACAAGCATTGTCTACTTCAGTCGATGAGCCTGAATTATCTGATACACTTCAATTGCATGAAGATACTAAAGAAAATGATAAACTACCCCTAGAATCATCTACAATCACATCTCCTACGGAAAGTGGAAGTTCTGATACAGA
ACACTGAAATAGCACACAGAAC
.
PASS
KMER=31;SVLEN=-423;SVTYPE=COMPLEX
GT:COV:GT_CONF:VFR_ED_RA:VFR_ED_TA:VFR_ALLELE_LEN:VFR_ALLELE_MATCH_COUNT:VFR_ALLELE_MATCH_FRAC:VFR_IN_MASK:VFR_RESULT
1/1:0,33:235.46:2:0:22:22:1.0:0:TP

The formula for precision is essentially (VFR_ED_TR - VFR_ED_TA) / VFR_ED_TR, but VFR_ED_TR is missing here. Reasoning it through, though: VFR_ED_TA is 0, so cortex's call is perfect, and therefore VFR_ED_TR = edit_distance(ref, alt).

i) Is that reasoning at the end correct?
ii) Is this a bug, so to speak? If so, I can probably push a PR that does the above, if you want.
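To make the proposal concrete, here is a rough sketch of the fallback in plain Python. It is not varifier's actual code: the function names are made up for illustration, and it assumes that VFR_ED_RA holds the ref/alt edit distance that would stand in for VFR_ED_TR when VFR_ED_TA is 0, which the field name suggests but which would need checking against the implementation.

```python
from typing import Dict, Optional


def sample_fields(format_col: str, sample_col: str) -> Dict[str, str]:
    """Pair the colon-separated FORMAT keys with the sample's values."""
    return dict(zip(format_col.split(":"), sample_col.split(":")))


def get_int(fields: Dict[str, str], key: str) -> Optional[int]:
    """Return the field as an int, or None if it is absent or '.'."""
    value = fields.get(key, ".")
    return None if value == "." else int(value)


def precision_score(fields: Dict[str, str]) -> Optional[float]:
    """(VFR_ED_TR - VFR_ED_TA) / VFR_ED_TR, with the fallback proposed above.

    Returns None when the score cannot be computed.
    """
    ed_ta = get_int(fields, "VFR_ED_TA")
    if ed_ta is None:
        return None
    ed_tr = get_int(fields, "VFR_ED_TR")
    if ed_tr is None and ed_ta == 0:
        # Assumption (not confirmed): the alt allele matches the truth exactly,
        # so the truth/ref distance equals the ref/alt distance in VFR_ED_RA.
        ed_tr = get_int(fields, "VFR_ED_RA")
    if ed_tr is None or ed_tr == 0:
        return None
    return (ed_tr - ed_ta) / ed_tr


# FORMAT and sample columns copied from the record in this issue.
FORMAT = ("GT:COV:GT_CONF:VFR_ED_RA:VFR_ED_TA:VFR_ALLELE_LEN:"
          "VFR_ALLELE_MATCH_COUNT:VFR_ALLELE_MATCH_FRAC:VFR_IN_MASK:VFR_RESULT")
SAMPLE = "1/1:0,33:235.46:2:0:22:22:1.0:0:TP"

fields = sample_fields(FORMAT, SAMPLE)
score = precision_score(fields)
print(score)
```

Note that when VFR_ED_TA is 0 the ratio is 1 for any non-zero VFR_ED_TR, so the fallback value only matters if numerators and denominators are summed across records before dividing.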

@martinghunt
Member

i) I think the reasoning is correct.
ii) It's a "feature". See the comments:

# Can have cases where VFR_ED_TR is not present. This is the edit
# distance between the truth and ref allele. Calculated from mapping
# of ref probe to the truth.
# Cases are:
# - FP because the alt probe does not map to the truth ref. In this
# case we don't map the ref probe to the truth.
# - FP, alt probe mapped, but the truth probe did not map
# - TP or Partial_TP, and we have VFR_ED_RA,VFR_ED_TA but not VFR_ED_TR.
# This is where alt probe mapped, but ref probe did not map, or no
# mapping found in same place as the alt mapping
# (either way, is counted as no hit).

I don't mind if you want to change the behaviour to add in VFR_ED_TR, but I'm not sure about edge cases that we might not think of. I can imagine a boatload of tests failing if you try changing it, but go for it if you want...
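One way to gauge those edge cases before changing anything might be to count how often VFR_ED_TR is absent for each VFR_RESULT value in an annotated VCF. The sketch below is only illustrative: `count_missing_ed_tr` is a hypothetical helper, it assumes a single-sample, tab-separated VCF, and the toy records are invented rather than taken from a real precision.vcf.

```python
from collections import Counter
from typing import Iterable


def count_missing_ed_tr(vcf_lines: Iterable[str]) -> Counter:
    """Count records lacking VFR_ED_TR, grouped by their VFR_RESULT value."""
    counts: Counter = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 10:
            continue  # no FORMAT/sample columns to inspect
        fields = dict(zip(cols[8].split(":"), cols[9].split(":")))
        if "VFR_ED_TR" not in fields:
            counts[fields.get("VFR_RESULT", "unknown")] += 1
    return counts


# Invented toy records, for illustration only.
toy_vcf = [
    "##fileformat=VCFv4.2",
    "chr1\t10\t.\tA\tG\t.\tPASS\t.\tGT:VFR_ED_RA:VFR_ED_TA:VFR_RESULT\t1/1:2:0:TP",
    "chr1\t20\t.\tC\tT\t.\tPASS\t.\tGT:VFR_ED_RA:VFR_ED_TR:VFR_ED_TA:VFR_RESULT\t1/1:1:1:0:TP",
    "chr1\t30\t.\tG\tA\t.\tPASS\t.\tGT:VFR_RESULT\t1/1:FP",
]

missing = count_missing_ed_tr(toy_vcf)
print(dict(missing))
```

Running something like this over a real precision.vcf would show whether the TP and Partial_TP cases without VFR_ED_TR are common enough to be worth the change.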
