Tutorial for general UNIX computers with Docker

  1. Download Cromwell.

    $ wget https://github.com/broadinstitute/cromwell/releases/download/34/cromwell-34.jar
    $ chmod +rx cromwell-34.jar
  2. Git clone this pipeline and move into it.

    $ git clone https://github.com/ENCODE-DCC/chip-seq-pipeline2
    $ cd chip-seq-pipeline2
  3. Download a SUBSAMPLED paired-end sample of ENCSR936XTK.

    $ wget https://storage.googleapis.com/encode-pipeline-test-samples/encode-chip-seq-pipeline/ENCSR936XTK/ENCSR936XTK_fastq_subsampled.tar
    $ tar xvf ENCSR936XTK_fastq_subsampled.tar
  4. Download the pre-built chr19/chrM-only genome database for hg38.

    $ wget https://storage.googleapis.com/encode-pipeline-genome-data/test_genome_database_hg38_chr19_chrM_chip.tar
    $ tar xvf test_genome_database_hg38_chr19_chrM_chip.tar
  5. Run the pipeline on the test sample.

    $ INPUT=examples/local/ENCSR936XTK_subsampled_chr19_only.json
    $ PIPELINE_METADATA=metadata.json
    $ java -jar -Dconfig.file=backends/backend.conf cromwell-34.jar run chip.wdl -i ${INPUT} -o workflow_opts/docker.json -m ${PIPELINE_METADATA}
  6. The run takes about 6 hours. All outputs are written to cromwell-executions/chip/[RANDOM_HASH_STRING]/. See output directory structure for details.

  7. See full specification for input JSON file.

  8. You can resume a failed pipeline from where it left off by using the PIPELINE_METADATA (metadata.json) file, which is created for each pipeline run. See here for details. Once you get a new input JSON file from the resumer, use INPUT=resume.[FAILED_WORKFLOW_ID].json instead of INPUT=examples/local/ENCSR936XTK_subsampled_chr19_only.json.
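
The choice of INPUT in steps 5 and 8 can be wrapped in a small helper, along with a check that the tools used above are installed. This is only a minimal sketch and not part of the pipeline: the function names `missing_tools` and `pick_input` are hypothetical, and it assumes the resume file produced by the resumer is readable at the path you pass in.

```shell
#!/bin/sh
# Hypothetical helpers for this tutorial; not shipped with chip-seq-pipeline2.

# Print the name of every tool in the argument list that is not on PATH.
# Returns 0 if all tools were found, 1 otherwise.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# Print the input JSON to use:
#   $1 = default input JSON (step 5)
#   $2 = optional resume JSON (step 8); used only if the file exists
pick_input() {
    default_input=$1
    resume_input=${2:-}
    if [ -n "$resume_input" ] && [ -f "$resume_input" ]; then
        echo "$resume_input"
    else
        echo "$default_input"
    fi
}

# Example usage (FAILED_WORKFLOW_ID is a placeholder you set yourself):
#   missing_tools java docker wget git tar || exit 1
#   INPUT=$(pick_input examples/local/ENCSR936XTK_subsampled_chr19_only.json \
#                      "resume.${FAILED_WORKFLOW_ID}.json")
```

Falling back to the default input when no resume file exists lets the same command line from step 5 serve both a first run and a resumed run.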