Installation

Aiden Lab edited this page Apr 3, 2017 · 23 revisions

Quick Start

  1. Verify that you've installed all dependencies.
  2. Set up your directories. You should have a Juicer directory containing scripts/, references/, and optionally restriction_sites/, and a different working directory containing fastq/. You should download the Juicer tools jar and install it in your scripts/ directory.
  3. Run Juicer.

Example with the CPU version. <myJuicerDir> is wherever you want to store the Git repository (keeping in mind you will want to pull updates). <my_reference_fastas_and_indices> is your reference assembly and the BWA index files. <myRestrictionSiteDir> will contain the restriction site files; see below for more information. <fastq_files> are your sequenced reads; they can remain gzipped.

cd 
git clone https://github.com/theaidenlab/juicer.git
cd <myJuicerDir>
ln -s ~/juicer/CPU scripts
mkdir references
cp <my_reference_fastas_and_indices> references/
# this is optional, only needed for fragment-delimited files
ln -s <myRestrictionSiteDir> restriction_sites
cd <myWorkingDir>
mkdir fastq
mv <fastq_files> fastq/
<myJuicerDir>/scripts/juicer.sh -D <myJuicerDir>
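
Before launching, it can help to confirm that the layout from step 2 is in place. The following is a minimal sketch, not something shipped with Juicer: the function name check_juicer_layout is hypothetical, and the checks cover only what the Quick Start describes (scripts/ and references/ in the Juicer directory, the Juicer tools jar in scripts/, and fastq/ in the working directory).

```shell
# Hypothetical helper (an assumption, not part of Juicer).
# Usage: check_juicer_layout <myJuicerDir> <myWorkingDir>
check_juicer_layout() {
    local juicer_dir="$1" work_dir="$2" status=0 d

    # Required directories inside the Juicer directory (symlinks are followed).
    for d in scripts references; do
        if [ ! -d "$juicer_dir/$d" ]; then
            echo "missing directory: $juicer_dir/$d" >&2
            status=1
        fi
    done

    # The Juicer tools jar is expected in scripts/.
    if [ ! -e "$juicer_dir/scripts/juicer_tools.jar" ]; then
        echo "missing file: $juicer_dir/scripts/juicer_tools.jar" >&2
        status=1
    fi

    # restriction_sites/ is optional, so its absence is only reported.
    if [ ! -d "$juicer_dir/restriction_sites" ]; then
        echo "note: no restriction_sites/ directory (optional)" >&2
    fi

    # The working directory must contain fastq/.
    if [ ! -d "$work_dir/fastq" ]; then
        echo "missing directory: $work_dir/fastq" >&2
        status=1
    fi

    return "$status"
}
```

The function returns a non-zero status if any required piece is missing, so it can guard a launch command with `&&`.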

Dependencies

Cluster Specification

Juicer currently works with the following resource management software:

Make sure to copy the appropriate scripts from the GitHub repo to your cluster or laptop, as well as the fastq reads and appropriate reference files. You should download the Juicer tools jar and install it in your scripts/ directory.

Directory Structure

See the Box mirror for an easy-to-navigate view of the directory structure.

The following also shows a sample configuration of all the files and directories on the cluster once Juicer is fully set up. It assumes that all files needed by Juicer are created under /opt/juicer.

You can also access another public mirror of these files by going to https://s3.amazonaws.com/juicerawsmirror/opt/juicer/[paths_below], for example: https://s3.amazonaws.com/juicerawsmirror/opt/juicer/work/HIC003/fastq/HIC003_S2_L001_R1_001.fastq.gz.
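
The mapping from a path under /opt/juicer to its mirror URL can be sketched as a small shell function. This is an illustration only: the names MIRROR_BASE and mirror_url are hypothetical, and the function simply prepends the mirror prefix given above without checking that the file exists on the mirror.

```shell
# Hypothetical helper: maps a path under /opt/juicer to its public mirror URL.
MIRROR_BASE="https://s3.amazonaws.com/juicerawsmirror"

mirror_url() {
    case "$1" in
        /opt/juicer/*)
            # The mirror keeps the full /opt/juicer/... path after the bucket name.
            printf '%s%s\n' "$MIRROR_BASE" "$1"
            ;;
        *)
            echo "not under /opt/juicer: $1" >&2
            return 1
            ;;
    esac
}

mirror_url /opt/juicer/work/HIC003/fastq/HIC003_S2_L001_R1_001.fastq.gz
```

The printed URL can then be passed to a download tool, for example `curl -O "$(mirror_url <path>)"`.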

# tmp directory
/opt/juicer/tmp

# sample work directory is /opt/juicer/work/HIC003
/opt/juicer/work/HIC003/fastq
/opt/juicer/work/HIC003/fastq/HIC003_S2_L001_R1_001.fastq.gz
/opt/juicer/work/HIC003/fastq/HIC003_S2_L001_R2_001.fastq.gz

# another sample work directory is /opt/juicer/work/MBR19
/opt/juicer/work/MBR19/fastq
/opt/juicer/work/MBR19/fastq/chr19_R1.fastq.gz
/opt/juicer/work/MBR19/fastq/chr19_R2.fastq.gz

# Core Juicer scripts from GitHub in /opt/juicer/scripts
/opt/juicer/scripts/chimeric_blacklist.awk
/opt/juicer/scripts/statistics.pl
/opt/juicer/scripts/stats_sub.awk
/opt/juicer/scripts/split_rmdups.awk
/opt/juicer/scripts/countligations.sh
/opt/juicer/scripts/juicer_tools
/opt/juicer/scripts/juicer_tools.jar
/opt/juicer/scripts/juicer_postprocessing.sh
/opt/juicer/scripts/dups.awk
/opt/juicer/scripts/juicer.sh
/opt/juicer/scripts/LibraryComplexity.class
/opt/juicer/scripts/hicInternalMenu.properties
/opt/juicer/scripts/abnormal.awk
/opt/juicer/scripts/check.sh
/opt/juicer/scripts/fragment.pl
/opt/juicer/scripts/makemega_addstats.awk
/opt/juicer/scripts/mega.sh
/opt/juicer/scripts/relaunch_prep.sh
/opt/juicer/scripts/cleanup.sh

# Sequence reference files in /opt/juicer/references
# hg19 and mm9 reference files in mirror
/opt/juicer/references/Homo_sapiens_assembly19.fasta
/opt/juicer/references/Mus_musculus_assembly9_norandom.fasta

Make sure to copy the appropriate scripts from the GitHub repo to your cluster, as well as the fastq reads and appropriate reference files.

After running BWA indexing with bwa index <fasta file> (this might take a couple of hours), the references directory also contains:

# after running BWA indexing
/opt/juicer/references/Homo_sapiens_assembly19.fasta.sa
/opt/juicer/references/Homo_sapiens_assembly19.fasta.ann
/opt/juicer/references/Homo_sapiens_assembly19.fasta.amb
/opt/juicer/references/Homo_sapiens_assembly19.fasta.pac
/opt/juicer/references/Homo_sapiens_assembly19.fasta.bwt
/opt/juicer/references/Mus_musculus_assembly9_norandom.fasta.bwt
/opt/juicer/references/Mus_musculus_assembly9_norandom.fasta.amb
/opt/juicer/references/Mus_musculus_assembly9_norandom.fasta.pac
/opt/juicer/references/Mus_musculus_assembly9_norandom.fasta.ann
/opt/juicer/references/Mus_musculus_assembly9_norandom.fasta.sa
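
Since indexing is slow, it may be worth checking whether the index already exists before rerunning it. The sketch below assumes the five extensions shown in the listing (.amb, .ann, .bwt, .pac, .sa) are the complete set for your BWA version; the function name bwa_index_complete is hypothetical.

```shell
# Hypothetical helper: succeeds only if all five BWA index files
# exist and are non-empty next to the given FASTA file.
bwa_index_complete() {
    local fasta="$1" ext missing=0
    for ext in amb ann bwt pac sa; do
        if [ ! -s "$fasta.$ext" ]; then
            echo "missing or empty: $fasta.$ext" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

A possible use is `bwa_index_complete <fasta file> || bwa index <fasta file>`, which runs the indexing step only when some index file is absent.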

(OPTIONAL) Build the restriction site files with python generate_site_positions.py <enzyme> <genome ID> <fasta> (the script is at https://github.com/theaidenlab/juicer/blob/master/misc/generate_site_positions.py). If you don't use fragment-delimited resolutions, you must run Juicer with the -x flag. After building, the files look like this:

# restriction sites files in /opt/juicer/restriction_sites
/opt/juicer/restriction_sites/mm9_HindIII.txt
/opt/juicer/restriction_sites/mm10_MboI.txt
/opt/juicer/restriction_sites/mm10_DpnII.txt
/opt/juicer/restriction_sites/hg19_MboI.txt
/opt/juicer/restriction_sites/hg38_MboI.txt
/opt/juicer/restriction_sites/hg38_DpnII.txt
/opt/juicer/restriction_sites/hg19_DpnII.txt
/opt/juicer/restriction_sites/hg19_HindIII_new.txt
/opt/juicer/restriction_sites/mm9_DpnII.txt
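
The choice of whether to pass -x can be sketched in shell. This is an assumption-laden illustration: the function name fragment_flag is hypothetical, and the <genome ID>_<enzyme>.txt naming is inferred from the listing above (note that hg19_HindIII_new.txt does not follow it).

```shell
# Hypothetical helper: prints "-x" when no restriction site file is found for
# the given genome ID and enzyme, and prints nothing when the file exists.
# Usage: fragment_flag <restriction_sites dir> <genome ID> <enzyme>
fragment_flag() {
    local sites_dir="$1" genome="$2" enzyme="$3"
    # File naming is an assumption inferred from the sample listing.
    if [ ! -s "$sites_dir/${genome}_${enzyme}.txt" ]; then
        echo "-x"
    fi
}
```

The printed value can be appended to the juicer.sh command line, so that fragment-delimited maps are skipped whenever the site file is missing.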