Using reference-free compressed data structures to analyse thousands of human genomes (1000 Genomes ReadServer)

Zhicheng-Liu/ReadServer

 
 

ReadServer

Using reference-free compressed data structures to analyse thousands of human genomes simultaneously

For more details, please see our manuscript.


Before you start


This project includes most of its requirements as git submodules. However, many of the submodules, as well as the main project itself, require a C++11-compatible compiler (GCC 4.8.1 or later should work). The following non-exhaustive table lists Unix/Linux distributions and the first release of each that ships (or should ship) a compatible GCC version:

| Distribution | Version | Release date (YYYY-MM-DD) | GCC |
|---|---|---|---|
| Fedora | 19 | 2013-07-02 | 4.8.1 |
| Ubuntu | 13.10 | 2013-10-17 | 4.8.1 |
| openSUSE | 13.1 | 2013-11-19 | 4.8.1 |
| Mint | 16 | 2013-11-30 | 4.8.1 |
| Debian | 8.0 | 2015-04-26 | 4.9.2 |
| Red Hat Enterprise Linux | RHEL-7.2 | 2015-11-19 | 4.8.5 |
| CentOS | 7-1511 | 2015-12-14 | 4.8.5 |
| Scientific Linux | 7.2 | 2016-02-05 | 4.8.5 |
| FreeBSD | 10.3 | 2016-04-04 | 4.8.4 |

If the default compiler on your system is not C++11-compatible, set the CC and CXX shell variables to the binaries of a compatible compiler and export them in your shell:

export CC='/usr/local/gcc_4_9/bin/gcc-4.9.1'
export CXX='/usr/local/gcc_4_9/bin/g++-4.9.1'

You might also have to add the path to the corresponding 'lib' directories to your LIBRARY_PATH and LD_LIBRARY_PATH variables:

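# ${VAR:+:$VAR} appends the old value only if it is non-empty, avoiding a trailing ':' (an empty entry means the current directory)
export LD_LIBRARY_PATH='/usr/local/gcc_4_9/lib:/usr/local/gcc_4_9/lib64'${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
export LIBRARY_PATH='/usr/local/gcc_4_9/lib:/usr/local/gcc_4_9/lib64'${LIBRARY_PATH:+:$LIBRARY_PATH}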
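
Before building, it can help to confirm that the selected compiler meets the GCC 4.8.1 minimum mentioned above. The following is a minimal sketch, not part of ReadServer: `version_ge` is a hypothetical helper, and the check assumes a GCC-style compiler and a `sort` that supports version sorting (`-V`).

```shell
#!/usr/bin/env bash
# Sketch: check that the compiler in $CXX (default: g++) is GCC >= 4.8.1.
# version_ge is a hypothetical helper, not part of ReadServer.

# Succeeds when version $1 is greater than or equal to version $2.
# Relies on `sort -V` (version sort), e.g. from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required='4.8.1'
compiler="${CXX:-g++}"

if command -v "$compiler" >/dev/null 2>&1; then
    # -dumpfullversion is understood by GCC 7 and later; older releases
    # reject it, so fall back to -dumpversion.
    found="$("$compiler" -dumpfullversion 2>/dev/null || "$compiler" -dumpversion)"
    if version_ge "$found" "$required"; then
        echo "$compiler $found looks new enough"
    else
        echo "$compiler $found is older than $required; set CC and CXX" >&2
    fi
else
    echo "no compiler found at '$compiler'" >&2
fi
```

If the check reports an older version, export CC and CXX as shown above and run it again.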

NOTE:
If you plan to compile and run the project in an HPC/farm/cluster environment, compile on a system whose hardware features are available on ALL nodes in your computing environment. Some components in this project (e.g. 'RocksDB') optimise themselves for the hardware of the machine they are compiled on. For instance, if the compiling system supports the SSE 4.2 instruction set, the final project will only run on machines that also support it.
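
One way to find a suitable compile host is to compare the CPU feature flags of the nodes. The sketch below is not part of ReadServer: `cpu_flags` and `common_flags` are hypothetical helpers, and the approach assumes Linux nodes that expose a `flags` line in `/proc/cpuinfo`.

```shell
#!/usr/bin/env bash
# Sketch: find the CPU feature flags shared by every node.
# Assumes Linux nodes with a "flags" line in /proc/cpuinfo.
# cpu_flags and common_flags are hypothetical helpers.

# Print the CPU flags from the given cpuinfo file (default: this machine),
# one per line, sorted and without duplicates.
cpu_flags() {
    grep -m 1 '^flags' "${1:-/proc/cpuinfo}" \
        | cut -d ':' -f 2 | tr ' ' '\n' | sed '/^$/d' | sort -u
}

# Print the flags that occur in every one of the given flag-list files.
# Each file must list a flag at most once (as cpu_flags output does).
common_flags() {
    # A flag is common if it occurs as many times as there are files.
    sort "$@" | uniq -c | awk -v n="$#" '$1 == n { print $2 }'
}
```

For example, run `cpu_flags > "flags.$(hostname)"` on each node, collect the files in one directory, and inspect `common_flags flags.*`. If `sse4_2` (the Linux name for SSE 4.2) is missing from the result, at least one node lacks that instruction set, and the build should be done on such a node.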


We tested installation and compilation on a fresh Ubuntu 16.04 LTS system and found that the following required packages are not part of the standard installation:

  • cmake
  • automake
  • libtool
  • texi2html
  • texinfo
  • docbook2x
  • zlib1g-dev
  • libbz2-dev

On an Ubuntu system (or any other distribution using the apt package manager and compatible repositories) you can install them with:

sudo apt install cmake automake libtool texi2html texinfo docbook2x zlib1g-dev libbz2-dev

(If you do not have administrator rights on your system, please ask your IT department for help.)
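
To see which of the build tools are still missing, a quick check along the following lines may help. It is a sketch only: `missing_tools` is a hypothetical helper, and the command names are assumptions about what the packages install (`libtoolize` for libtool, `makeinfo` for texinfo). The development libraries (zlib1g-dev, libbz2-dev) provide no commands, so they are not covered here.

```shell
#!/usr/bin/env bash
# Sketch: report which build tools are not on PATH.
# missing_tools is a hypothetical helper, not part of ReadServer.

# Print each argument that cannot be found as a command on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Assumed command names: libtoolize ships with libtool, makeinfo with texinfo.
missing_tools cmake automake libtoolize makeinfo texi2html
```

An empty result means all listed commands were found; any name that is printed points to a package that still needs to be installed.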



Installation


You can get a clone of this repository by typing

git clone https://github.com/wtsi-svi/ReadServer

or

git clone git@github.com:wtsi-svi/ReadServer.git

on your command line.


After you have cloned this project, change to the git project directory (e.g. '/usr/local/ReadServer') and execute:

bash install_dependencies.sh

The script should automatically fetch all required submodules and compile them. Afterwards run:

make clean
make
make install

Should you encounter compilation errors when running 'make', add

CC=<path_to_c++11_compatible_c_compiler_binary> CXX=<path_to_c++11_compatible_c++_compiler_binary>

to the 'make' command, e.g.

make CC='/usr/local/gcc_4_9/bin/gcc-4.9.1' CXX='/usr/local/gcc_4_9/bin/g++-4.9.1'

(the paths should be identical to those set for the shell variables).
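
To keep the 'make' arguments in step with the shell variables, they can be derived from CC and CXX instead of being typed twice. This is a minimal sketch, assuming bash 4 or later for `mapfile`; `make_compiler_args` is a hypothetical helper and not part of the project.

```shell
#!/usr/bin/env bash
# Sketch: derive the CC=/CXX= arguments for 'make' from the environment.
# make_compiler_args is a hypothetical helper, not part of ReadServer.

# Print one 'make' argument per line for each of CC and CXX that is
# set and non-empty; print nothing if neither is set.
make_compiler_args() {
    if [ -n "${CC:-}" ]; then printf 'CC=%s\n' "$CC"; fi
    if [ -n "${CXX:-}" ]; then printf 'CXX=%s\n' "$CXX"; fi
}

# Usage (commented out so that sourcing this file does not start a build):
#   mapfile -t args < <(make_compiler_args)   # one argument per line
#   make clean
#   make "${args[@]}"
#   make install
```

Reading one argument per line keeps compiler paths that contain spaces intact.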


Once installation is finished, change to the 'demo' subdirectory and follow the instructions in 'README.md' to build a demonstration population BWT ReadServer.
