DeepGRP - Reproducibility

Repository for reproducing the results of the DeepGRP publication (currently under review).

Installing requirements

You can install all required packages using poetry with:

git clone https://github.com/fhausmann/deepgrp_reproducibility
cd deepgrp_reproducibility
poetry install

Note

To fully reproduce the results from the DeepGRP paper, you need a version of RepeatMasker here with cross_match and a version of Repbase, which cannot be provided due to licensing restrictions.

To use the packages installed via poetry, you have to activate the poetry environment via:

poetry shell

or run your command using:

poetry run <your command>

Getting the data

You can download all required training/testing data and required programs with make:

poetry run make

Warning: this can take a while, depending on your connection.

Reproducing

All results in the paper were generated with the hyperparameters in best_model.toml.

The hyperparameters in best_model.toml were found using the following search space:

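============================  ==================  =======================================
Parameter                     Parameter Name      Distribution
============================  ==================  =======================================
Window size                   vecsize             q-normal(\mu = 200, \sigma = 20, q = 2)
Recurrent units               units               q-normal(\mu = 32, \sigma = 5, q = 2)
Dropout                       dropout             Uniform(low = 0, high = 0.4)
RMSprop momentum              momentum            Uniform(low = 0, high = 1.0)
RMSprop decay                 rho                 Uniform(low = 0, high = 1.0)
Learning rate                 learning_rate       Lognormal(\mu = -7, \sigma = 0.5)
Repeat probability per batch  repeat_probability  Uniform(low = 0, high = 0.49)
============================  ==================  =======================================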
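To get a feeling for this search space, the following sketch draws one random configuration using only the Python standard library. It is not the search code used for the paper. It assumes that q-normal means a normal draw rounded to the nearest multiple of q (the convention used by hyperopt's qnormal) and that Lognormal(\mu, \sigma) is the exponential of a normal draw. The function names are hypothetical.

```python
import random


def qnormal(rng, mu, sigma, q):
    """Normal draw rounded to the nearest multiple of q (assumed meaning of q-normal)."""
    return round(rng.gauss(mu, sigma) / q) * q


def sample_hyperparameters(rng):
    """Draw one configuration from the search space listed in the table."""
    return {
        "vecsize": qnormal(rng, 200, 20, 2),
        "units": qnormal(rng, 32, 5, 2),
        "dropout": rng.uniform(0.0, 0.4),
        "momentum": rng.uniform(0.0, 1.0),
        "rho": rng.uniform(0.0, 1.0),
        # lognormvariate(mu, sigma) returns exp(N(mu, sigma))
        "learning_rate": rng.lognormvariate(-7, 0.5),
        "repeat_probability": rng.uniform(0.0, 0.49),
    }


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same draw
    for name, value in sample_hyperparameters(rng).items():
        print(f"{name}: {value}")
```

With \mu = -7 and \sigma = 0.5, the median learning rate is exp(-7), roughly 9e-4.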

The performance and benchmarking results can be downloaded as JSON files from results. All trained models can be found at models.
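Once downloaded, the result files can be read with the standard json module. The sketch below assumes the files were saved into a local directory named results and makes no assumption about their internal structure; the helper name load_results is hypothetical.

```python
import json
from pathlib import Path


def load_results(directory="results"):
    """Return {file stem: parsed JSON} for every .json file in directory.

    Hypothetical helper; the directory name is an assumption.
    """
    return {
        path.stem: json.loads(path.read_text(encoding="utf-8"))
        for path in sorted(Path(directory).glob("*.json"))
    }


if __name__ == "__main__":
    # Print each file name with the top-level JSON type it contains.
    for name, content in load_results().items():
        print(name, type(content).__name__)
```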

Training

DeepGRP can be trained with the Jupyter notebook Training_deepgrp.ipynb, and dna-nn with Training_dnabrnn.ipynb.

Benchmark

The benchmark can be run with Benchmark.ipynb.

To evaluate the results from the benchmark experiments, use Evaluation.ipynb.

Figures

All figures of the paper can be generated with Figures.ipynb. They will be saved in a figures subfolder.
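The notebooks are meant to be run interactively, but they could also be executed from a script. The sketch below builds jupyter nbconvert commands for the notebooks named in this README. It assumes that nbconvert is available in the poetry environment and that the notebooks should run in the order listed, neither of which this README confirms. The function names are hypothetical, and by default the script only prints the commands.

```python
import shlex
import subprocess

# Notebook names are taken from this README; the execution order is an assumption.
NOTEBOOKS = [
    "Training_deepgrp.ipynb",
    "Training_dnabrnn.ipynb",
    "Benchmark.ipynb",
    "Evaluation.ipynb",
    "Figures.ipynb",
]


def nbconvert_command(notebook):
    """Build a command that executes a notebook in place via poetry."""
    return [
        "poetry", "run", "jupyter", "nbconvert",
        "--to", "notebook", "--execute", "--inplace", notebook,
    ]


def run_all(notebooks=NOTEBOOKS, dry_run=True):
    """Print the commands (dry run) or execute them one after another."""
    for notebook in notebooks:
        command = nbconvert_command(notebook)
        if dry_run:
            print(shlex.join(command))
        else:
            subprocess.run(command, check=True)  # stop at the first failure


if __name__ == "__main__":
    run_all(dry_run=True)
```

Call run_all(dry_run=False) to actually execute the notebooks. Training cells can run for a long time, and some nbconvert versions apply a per-cell timeout that may need to be raised.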