Graphtyper flag may need to be added if working with Nanopore data otherwise empty merged vcf file #70

masudermann · 2024-05-02T18:21:54Z

Description of the bug

In discussion with Alex, Upasana, and Fernanda, we realized that we likely need to include the flag
--no_filter_on_proper_pairs when using the graphtyper genotype command with the 150-bp trimmed long reads.

(Thanks to Alex for bringing these additional parameters, listed under graphtyper genotype --advanced --help, to our attention.)

Without this flag, we found that merging the individual VCF files with the graphtyper vcf_concatenate command produces an empty merged output file: no variants are called.
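A quick way to check for this symptom is to count the variant records in the merged file. The following is only a minimal sketch, not part of the pipeline: the function name `count_vcf_records` and the file name in the usage note are hypothetical, and it assumes a standard VCF in which header lines start with `#`.

```python
import gzip

def count_vcf_records(path):
    """Count variant records (non-header, non-blank lines) in a plain or gzipped VCF."""
    # Assumption: a ".gz" suffix means the file is gzip/bgzip compressed.
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        return sum(1 for line in handle if line.strip() and not line.startswith("#"))
```

A return value of 0 for something like `count_vcf_records("merged.vcf.gz")` would match the empty merged file described above.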

Command used and terminal output

No response

Relevant files

No response

System information

No response


zachary-foster · 2024-05-02T22:31:42Z

I just checked and reproduced this behavior in the pipeline. Without --no_filter_on_proper_pairs, no variants are called when only nanopore samples are used; with --no_filter_on_proper_pairs, variants are called. Does anyone know of a reason this flag should not be included when there are no nanopore samples? It's easy to add it all of the time. Adding it only when nanopore samples are included is a bit more work, but probably still not bad.
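To make the two options concrete, here is a rough sketch of how the command could be assembled either way. This is not the pipeline's actual code: the function name, its arguments, and the `always_add_flag` switch are hypothetical, and the `--sams` and `--region` options reflect my understanding of typical graphtyper genotype usage, so they should be checked against `graphtyper genotype --help`.

```python
def build_genotype_command(reference, sams_list, region,
                           has_nanopore, always_add_flag=True):
    """Build an argument list for graphtyper genotype (illustrative only)."""
    # Assumed invocation shape: graphtyper genotype <ref> --sams=<list> --region=<region>
    cmd = ["graphtyper", "genotype", reference,
           f"--sams={sams_list}", f"--region={region}"]
    # Option 1: always add the flag. Option 2: add it only for nanopore samples.
    if always_add_flag or has_nanopore:
        cmd.append("--no_filter_on_proper_pairs")
    return cmd
```

With `always_add_flag=True` the `has_nanopore` argument is ignored, which is the "add it all of the time" option; setting it to `False` gives the conditional behavior.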

masudermann · 2024-05-03T00:04:09Z

That is a good question and something I wondered too.

I'm looking into what happens when I call variants with and without the flag, for a small dataset of short read samples.

masudermann · 2024-05-03T19:52:19Z

I did a quick experiment. I took 6 short-read P. ramorum samples and ran graphtyper twice with identical settings, once with the flag and once without it. I then filtered for SNPs only, as the pipeline does.

When I look at pairwise SNP differences between samples, results are very similar, but not identical.

For each sample pair, the graphtyper run that used the flag identified roughly 15 to 30 more SNP differences.
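For anyone who wants to repeat this kind of comparison, pairwise SNP differences could be counted along the following lines. This is a generic sketch, not the method used to produce the matrices in this comment: the function name and input layout (one genotype call per SNP site per sample, `None` for a missing call) are assumptions.

```python
from itertools import combinations

def pairwise_snp_differences(genotypes):
    """Count differing genotype calls for every pair of samples.

    genotypes maps sample name -> list of calls, one per SNP site, in the
    same site order for all samples. Sites where either call is None
    (missing) are skipped for that pair.
    """
    diffs = {}
    for a, b in combinations(sorted(genotypes), 2):
        diffs[(a, b)] = sum(
            1
            for call_a, call_b in zip(genotypes[a], genotypes[b])
            if call_a is not None and call_b is not None and call_a != call_b
        )
    return diffs
```

Running it once on the calls made with the flag and once on the calls made without it would give two sets of pairwise counts that can be compared directly.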

Here is the matrix when I don't include the flag:

Here is the matrix when I do include the flag:

zachary-foster · 2024-05-03T20:29:24Z

Nice! Those look very similar. If that is representative of most datasets, then I think we can just leave this flag always on for now.

masudermann changed the title ~~Graphtyper parameter may need to be added if working with Nanopore data otherwise empty merged vcf file~~ Graphtyper flag may need to be added if working with Nanopore data otherwise empty merged vcf file May 2, 2024
