Releases · hepcat72/vcfSampleCompare

06 Dec 16:53

hepcat72

23309a7

Patch 2 to initial public release of version 2 (Latest)

This utility sorts and (optionally) filters the rows/variants of a VCF file (containing data for 2 or more samples) based on the differences in the variant data between samples or sample groups. The degree of "difference" is determined either by the best possible degree of separation of sample groups by genotype calls, or by the difference in average allelic frequency between samples or sample groups (with a gap size threshold). The pair of samples or sample groups used to represent the difference for a variant row is the one that yields the greatest difference in consistent genotype or in average allelic frequency (i.e. observation ratio, e.g. AO/DP) for the same variant state. If sample groups are not specified, the pair of samples leading to the greatest difference is greedily discovered and chosen to represent the variant/row.
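To make the allelic-frequency comparison concrete, here is a minimal Python sketch of choosing the pair of individual samples whose observation ratios (AO/DP) differ the most for one variant row. It is an illustration only, not the utility's own code: the function names and the input layout (a dict of sample name to an `(AO, DP)` tuple) are assumptions, it ignores sample groups and the gap size threshold, and it uses an exhaustive pairwise search rather than the greedy discovery the tool describes.

```python
from itertools import combinations


def observation_ratio(ao, dp):
    """Return AO/DP, or None when the count is missing or the depth is zero/missing."""
    if ao is None or not dp:
        return None
    return ao / dp


def most_different_pair(samples):
    """Pick the two samples whose observation ratios differ the most.

    `samples` maps a (hypothetical) sample name to an (AO, DP) tuple.
    Returns (difference, name_a, name_b), or None if fewer than two
    samples have a usable ratio.
    """
    usable = {}
    for name, (ao, dp) in samples.items():
        ratio = observation_ratio(ao, dp)
        if ratio is not None:
            usable[name] = ratio

    best = None
    # Exhaustive search over all pairs; a simplification of the greedy
    # approach described in the release notes.
    for a, b in combinations(sorted(usable), 2):
        diff = abs(usable[a] - usable[b])
        if best is None or diff > best[0]:
            best = (diff, a, b)
    return best


if __name__ == "__main__":
    row = {"s1": (0, 20), "s2": (10, 20), "s3": (18, 20)}
    print(most_different_pair(row))
```

Rows could then be sorted by the returned difference, with rows below some threshold filtered out; how the real utility scores groups is not covered by this sketch.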

Currently, there are 2 separate output files and filtered lines of the VCF file are omitted. The next major version will instead incorporate sample comparison data into the VCF format and set the filter column value.

This incremental release fixes an issue with the handling of VCF files that have been merged using bcftools merge. When bcftools merge combines sample columns where the number of ALT values differs and no data is present for the sample containing a single ALT value, it does not expand a single dot to a comma-separated series of dots. Previous versions, upon encountering this, would die with a fatal error.
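The kind of normalization this fix implies can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions, not the utility's actual code: `expand_missing` is a hypothetical helper, and it assumes the affected sample field carries one value per ALT allele (such as AO).

```python
def expand_missing(value, n_expected):
    """Expand a lone '.' into n_expected comma-separated dots.

    `value` is one per-sample field value as a string; `n_expected` is
    the number of values the field should hold (assumed here to be the
    number of ALT alleles on the row). Other values are returned as-is.
    """
    if value == "." and n_expected > 1:
        return ",".join(["."] * n_expected)
    return value


if __name__ == "__main__":
    alt_column = "A,T"                    # two ALT alleles on this row
    n_alt = len(alt_column.split(","))
    print(expand_missing(".", n_alt))     # sample with no data
    print(expand_missing("3,5", n_alt))   # sample with data is left alone
```

Normalizing the value before splitting it on commas means downstream code always sees the same number of entries per sample.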

15 Oct 18:30

hepcat72

v2.008

97e103e

Patch to Initial public release of version 2

Currently, there are 2 separate output files and filtered lines of the VCF file are omitted. The next version will instead incorporate sample comparison data into the VCF format and set the filter column value.

This incremental release fixes an issue with the handling of missing genotype calls when creating initial minimum sample groups. Previously, in genotype mode, if there weren't enough samples with genotype calls to fill the minimum group sizes, the script would encounter a fatal error. Now, such rows are correctly processed.

08 Oct 15:40

hepcat72

v2.006

bfc01c9

Initial public release of version 2
