Code and processed data for Garruss et al. 2021 "Deep Representation Learning Improves Prediction of LacI-mediated Transcriptional Repression"
Please refer to the manuscript for a full description of computational methods, external code/data sources, and primary references.
See the Gene Expression Omnibus (GEO) Series GSE175456 and Series GSM1940482 for the raw paired-end sequencing reads for this study.
Structure of this repository:
The directory "laci_selection" contains the processed data files for experimental repression values,
the directory "laci_modelling" contains Molecular Modelling and Simulation notes and files for Rosetta,
the directory "laci_conservation" contains the alignment and coupling files,
the directory "machine_learning" contains structured/indexed input data and the code for ML model comparison.
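
After cloning, the layout described above can be sanity-checked with a short script. This is only a minimal sketch, not part of the repository: the helper name check_layout and the default of running from the repository root are assumptions made for illustration, and only the four directory names come from the list above.

```python
from pathlib import Path

# Directory names are taken from the list above; the descriptions paraphrase it.
EXPECTED_DIRS = {
    "laci_selection": "processed experimental repression values",
    "laci_modelling": "molecular modelling and simulation notes, Rosetta files",
    "laci_conservation": "alignment and coupling files",
    "machine_learning": "structured/indexed input data, ML model comparison code",
}


def check_layout(repo_root="."):
    """Hypothetical helper: map each expected directory name to whether it exists."""
    root = Path(repo_root)
    return {name: (root / name).is_dir() for name in EXPECTED_DIRS}


if __name__ == "__main__":
    # Assumes the script is run from the top level of the cloned repository.
    for name, present in check_layout().items():
        status = "found" if present else "MISSING"
        print(f"{name:18s} {status:8s} {EXPECTED_DIRS[name]}")
```

A directory reported as MISSING usually means the script was run from somewhere other than the repository root, or the clone is incomplete.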
Reach out to garruss@fas.harvard.edu with any questions.