This repository has been archived by the owner on Oct 12, 2018. It is now read-only.


Notice

This project is no longer maintained. Any logical and performance differences between this program and the lumpy_filter program maintained as part of lumpy-sv have been addressed. You should move to using lumpy_filter. The easiest way to do so is to use smoove.



Description

The purpose of this program is to extract splitter and discordant reads from a CRAM or BAM file using logic identical to SAMBLASTER. This allows the generation of splitter and discordant files without name-sorting the input file. Unlike SAMBLASTER, which appends '_1' and '_2' to splitter read names, this program alters read names in the splitter output by changing the first character to 'A' for read1 and 'B' for read2.
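The read-name rule can be illustrated with a small shell sketch. This is an illustration only, not the program's code, and the read name below is made up; only the first-character substitution ('A' for read1, 'B' for read2) is the documented behavior:

```shell
# Hypothetical read name; extract-sv-reads replaces its first character.
name="H0164ALXX140820:2:1101:10003:23460"

# ${name:1} drops the first character; prepend the mate-specific letter.
read1_name="A${name:1}"   # name as it appears for read1 in the splitter BAM
read2_name="B${name:1}"   # name as it appears for read2 in the splitter BAM

echo "$read1_name"
echo "$read2_name"
```

Because both mates keep the same name length and suffix, downstream tools that pair reads by name only need to ignore the first character.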

Usage Notes

Splitters and discordants are output as BAM files. Duplicates are included by default but can be excluded using the -e option. As of version 1.2.0, threading benefits both BAM and CRAM input, and specifying more than one thread will speed up the program significantly. CRAM is supported as an input format; however, when running on a CRAM file I highly recommend using the -T option. The -T option prevents htslib from downloading the reference sequence used to encode the CRAM to the REF_CACHE location. By default, this location is in the current user's home directory and may prove problematic for those with small home directories. See the htslib documentation for more information.
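As an alternative (or complement) to -T, htslib's reference download location can be redirected away from the home directory via the REF_CACHE environment variable. The path below is only an example; the %2s/%s placeholders are htslib's documented cache-layout expansions:

```shell
# Redirect htslib's CRAM reference cache to a scratch location instead of
# the default under the home directory. /tmp/hts-ref is an example path.
export REF_CACHE=/tmp/hts-ref/%2s/%2s/%s
echo "$REF_CACHE"
```

Set this in the environment before running the program so any reference downloads land on a filesystem with adequate space.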

Credits

This program is heavily based on code from SAMBLASTER, unpublished code from Ryan Layer, and code written by Travis Abbott in diagnose_dups.

Compilation and Installation

Currently, extract_sv_reads must be compiled from source code. It is routinely tested using both the gcc and clang compilers on Ubuntu 12.04. It should work on other Unix-based operating systems, but they are not supported.

External Dependencies

Included Dependencies

Boost 1.59, htslib 1.6, and zlib 1.2.8 are included with the source code and will be utilized during compilation. Older versions of Boost will not work if specified directly.

Basic Build Instructions

Installing dependencies

  • For APT-based systems (Debian, Ubuntu), install the following packages:
sudo apt-get install build-essential git-core cmake

Download a stable version or clone the repository

Download and extract the code of the latest release

or clone from the master branch using git

git clone git://github.com/hall-lab/extract_sv_reads.git

Build the program

extract-sv-reads does not support in-source builds, so create a separate build subdirectory, enter it, build, and run the tests:

mkdir extract_sv_reads/build
cd extract_sv_reads/build
cmake ..
make -j
make test

Tests should pass. The extract-sv-reads binary can then be found under extract_sv_reads/build/bin. If you have administrative rights, run sudo make install to install the tool for all users under /usr/bin.

Building with additional libraries

htslib can be linked against curl for interaction with AWS and GCS. In addition, it can be linked with lzma and bz2 for full read support of all CRAM file types. To enable these features, install the following packages.

Dependencies

  • For APT-based systems (Debian, Ubuntu):
sudo apt-get install libbz2-dev liblzma-dev libssl-dev libcurl4-openssl-dev 

Building

mkdir extract_sv_reads/build
cd extract_sv_reads/build
cmake -DHTSLIB_USE_LIBCURL=1 -DHTSLIB_USE_LZMA=1 -DHTSLIB_USE_BZ2=1 ..
make -j
make test

Citing

Please cite extract-sv-reads using its DOI. The current DOI corresponds to the latest version; if you used an earlier version, its DOI may differ and can be found on Zenodo.

Getting Help

Please open an issue on the GitHub repository to get help.