Aminoacyl-tRNA Synthetases (aaRSs) are a class of enzyme central to translation, facilitating the interaction of tRNAs to a cognate canonical amino acid (CAA). To enable expansion of the natural chemical toolbox to encompass non-canonical amino acids (NCAAs), new enzymes of this class have to be enginereed to accept the alternative chemistries offered by NCAAs, while remaining inert to pre-existing CAAs.
The engineering workflow facilitates structure prediction of the complete
permutational complexity of the defined mutant landscape through Rosetta Cartesian energy minimization, followed by blind docking of
native and target (exogenous) amino acids by CB-Dock.
Mutants are then scored for fitness using the Delta and RMSD metrics.
Delta:
- Measures enzyme engineered favourability as:
NCAA Dock Score - Native Dock Score
- Lower the score, the greater the engineered affinity for the NCAA over the native substrate
RMSD:
- Estimation metric of the mutant producing a productive docking pose with the target NCAA
- Root-Mean-Square Deviation of the exogenously docked NCAA to the crystal-structure derived docking position of the native amino acid
- Lower the RMSD, the greater the mutant appears to dock the NCAA in a productive binding pose.
This pipeline uses the Rosetta 3 cartesian_ddg script for structure prediction.
This software suite is incredibly large (~18Gb compressed) and so can't be packaged
within this repository. Download here and move the uncompressed directory with binaries (or compile yourself), to the /resources/ directory.
(Version rossetta_bin_linux_2020.08.61146_bundle)
This is a Nextflow pipeline and as such, can only be run on a POSIX OS, if using windows I'd recommend using the Windows Subsystem for Linux (WSL)
git clone https://github.com/J-E-J-S/aaRS-pipeline
conda env create -f environment.yml
conda activate aaRS-pipeline
./aaRS-pipeline.sh -i -m -r
- Create a
mutations.txt
file in the/inputs/
directory
Must be in form:
X99 MNQ
Y100 MNQ
Z101 MNQ
Where:
- X is the single-letter ID for the wild-type residue to be mutated
- The number '99' is the residue number of the mutable position
- MNQ is the pool of residues to be mutated at this position
- Add the native and exogenous (target) amino acids to the
/inputs/
directory
- Amino acids have to be labelled as
nativeLigand.mol2
andexogenousLigand.mol2
respectively - Amino acids must be in
.mol2
standard - The native amino acids must be taken from the known crystal structure of the template enzyme
- This permits the RMSD calculation to estimate the productivity of the NCAA docking pose
- Add the template enzyme
.pdb
file to the/inputs/
directory
- To install the prerequisite scripts run the shell script with the
-i
option (this only has to be performed once)./aaRS-pipeline.sh -i
- To prepare the mutational file system to begin pipeline flow, run the shell script with the
-m
option (this has to be performed for every new template enzyme or mutational context)
./aaRS-pipeline.sh -m
- To begin the pipeline, run the shell script with the
-r
option
./aaRS-pipeline.sh -r
or run cmdnextflow run main.nf
- To run pipeline from beginning to end as a new user, combine all options
./aaRS-pipeline.sh -i -m -r
Results are compiled into a JSON object that can be found in the /output/
directory, in the form:
{
mutantID:
{
'exogenousScores': [],
'nativeScore': float,
'Delta': [],
'RMSD' : [],
'structurePath': string,
'dockingPath': string
}
}
This file becomes the store for all the data for the pipeline run.
- Run the script:
python queryResults.py <mutantQn> <rmsdCutOFf> <./../output/results.json>
in the /bin/ dir
Where:
- mutantQn = number of mutants desired to filter down to
- rmsdCutOff = the minimal accepted RMSD value (2Å recommended)
- /output/results.json = path to the pipeline generated results.json
Creates individual summary directories for mutants with the structure.pdb
,
docking_results.mol2
and a summary.fasta
file, as well as an overall
summary.fasta
file.
- Manually inspect the results in PyMOL against the wild-type template
- Look for similarity in binding mode of the target NCAA to the native CAA
- Mutant Structure
.pdb
and NCAA docking results.mol2
are compiled into the queried results generated directory