Limitations of --output-vcf-path and --log-file #15
Comments
Would it be preferable to support compressed output by writing to standard output, or by having the program handle piping through bgzip? One advantage of the latter strategy is that a single invocation of the program could process multiple input files and avoid repeatedly reading and processing some of the information in the manifest.
Yes, I see your point. Not sure what to suggest. You probably don't want to implement bgzip within the code, and that actually isn't even my use case, as I always use compressed binary VCF (a.k.a. .bcf) instead. I thought, though, that allowing /dev/stdout for single VCF conversion would be an easy fix.
It wouldn't be necessary to implement bgzip, just internally pipe to a bgzip process. Though to your point, output to standard out is trivial and implicitly supports multiple compression mechanisms. I can put this into a pull request.
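For illustration, here is a minimal sketch of what "output to standard out" could look like. It is not taken from the GTCtoVCF code base: the helper names `open_vcf_output` and `write_vcf`, and the convention that a missing path or `-` means stdout, are assumptions made for this example.

```python
import contextlib
import sys


@contextlib.contextmanager
def open_vcf_output(path=None):
    """Yield a writable text handle: stdout when no path is given, else a file.

    Treating None, "-" and "/dev/stdout" as "write to stdout" is an assumed
    convention for this sketch, not documented program behaviour.
    """
    if path in (None, "-", "/dev/stdout"):
        # Leave sys.stdout open; the caller may still need it afterwards.
        yield sys.stdout
    else:
        with open(path, "w") as handle:
            yield handle


def write_vcf(lines, path=None):
    """Write pre-formatted VCF lines (hypothetical input) to the chosen output."""
    with open_vcf_output(path) as out:
        for line in lines:
            out.write(line + "\n")
```

With output on stdout, compression is left to the shell, for example by piping into `bgzip -c > out.vcf.gz` or `bcftools view -Ob -o out.bcf -`.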
There is a candidate branch at https://github.com/Illumina/GTCtoVCF/tree/standard_output if you'd like to make any additional comments.
There are two limitations/inconsistencies that I believe would be important to resolve:
1) It would be nice to be able to output to stdout, as this could be piped into tools like bcftools to generate compressed binary files rather than really large uncompressed text files. Ideally, the default behaviour would be to output to stdout if --output-vcf-path is not used.
2) --log-file generates a log file (more or less including what is output on stderr), but even when the option is active the program still writes all the log output to stderr, duplicating it. It is great that by default the log goes to stderr, but if the --log-file option is used then stderr should not be used as well.
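To make the second point concrete, here is a rough sketch of the behaviour I have in mind, using Python's standard `logging` module. The function name `configure_logging` and the logger name are made up for this example; I have not checked how the program currently sets up its handlers.

```python
import logging
import sys


def configure_logging(log_file=None):
    """Send log records to log_file if given, otherwise to stderr (never both)."""
    logger = logging.getLogger("gtc_to_vcf_sketch")  # hypothetical logger name
    logger.setLevel(logging.INFO)

    # Drop handlers left over from earlier calls so nothing is duplicated.
    for old in list(logger.handlers):
        logger.removeHandler(old)
        old.close()

    if log_file:
        handler = logging.FileHandler(log_file)
    else:
        handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)

    # Stop records from also reaching any stderr handler on the root logger.
    logger.propagate = False
    return logger
```

The key choice is attaching exactly one handler: a `FileHandler` when --log-file is given, a `StreamHandler` on stderr otherwise.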
Last, I think it would be beneficial if the check for whether the manifest file matches the manifest used in the GTC files happened at the very beginning, so that the user could promptly fix a mismatch without having to wait a very long time.
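A fail-fast version of that check might look roughly like the sketch below. Everything here is hypothetical: `read_manifest_name` stands in for whatever reads the manifest name recorded in a GTC file, and `convert` stands in for the existing per-file conversion; neither is a real function from the project.

```python
def find_manifest_mismatches(gtc_paths, expected_manifest, read_manifest_name):
    """Return the GTC paths whose recorded manifest differs from the expected one.

    read_manifest_name is a hypothetical callable (path -> manifest name);
    how the name is actually read from a GTC file is not shown here.
    """
    return [
        path for path in gtc_paths
        if read_manifest_name(path) != expected_manifest
    ]


def run_conversion(gtc_paths, expected_manifest, read_manifest_name, convert):
    """Validate every input up front, then convert; abort before any slow work."""
    mismatched = find_manifest_mismatches(
        gtc_paths, expected_manifest, read_manifest_name
    )
    if mismatched:
        raise SystemExit("Manifest mismatch in: " + ", ".join(mismatched))
    for path in gtc_paths:
        convert(path)
```

Running the cheap check over all inputs first means a mismatch is reported before any of the expensive processing starts.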