Integrative model of the WDR76-SPIN1-Nucleosome complex

This repository contains the integrative model of the WDR76-SPIN1-Nucleosome complex, based on data from chemical crosslinking, X-ray crystallography, and AlphaFold structure prediction. It includes the input data, scripts for data preprocessing and modeling, and results, including bead models and localization probability density maps. The modeling was performed using IMP (Integrative Modeling Platform).

The integrative structure is deposited in the PDB with accession code 9A8I (PDB-Dev accession: PDBDEV_00000382).

Directory structure

data : contains the subdirectories for the input data used for modeling all the subcomplexes.
scripts : contains all the scripts used for modeling and analysis of the models.
results : contains the models and the localization probability densities of the top cluster of the subcomplexes.
test : contains the scripts for testing the sampling.

Protocol

Preprocessing the crosslinks

For crosslinks in sheetA of data/xlinks/original_suppmat_DataS3.xlsx:

python get_protein_uniprot_mapping.py -x /home/shreyas/Dropbox/washburn_wdr_spin/xls_sheet1.csv

Create a file named proteins_of_interest.txt listing the proteins of interest, then use it to run:

python get_protein_uniprot_mapping.py -x /home/shreyas/Dropbox/washburn_wdr_spin/xls_sheet1.csv -m mapping -p proteins_of_interest.txt

Finally, to generate the input file for modeling, do:

python xl_preprocessing.py ~/Dropbox/washburn_wdr_spin/xls_sheet1.csv uniprot_mapping.yaml

For crosslinks in sheetA of data/xlinks/original_suppmat_DataS3.xlsx, run the following command to generate the crosslinks input file for modeling:
```
python xl_change_protnames_from_ncbi2name.py
```
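
As a rough illustration of what a renaming step of this kind involves, the sketch below rewrites protein identifiers in crosslink rows using a lookup table. The column layout (protein1, residue1, protein2, residue2), the function name `rename_crosslinks`, and the example identifiers are assumptions made for illustration; they are not taken from xl_change_protnames_from_ncbi2name.py.

```python
import csv
import io


def rename_crosslinks(rows, id_to_name):
    """Replace protein identifiers in crosslink rows with short names.

    rows: iterable of (protein1, residue1, protein2, residue2) records.
    id_to_name: dict mapping database identifiers to modeling names.
    Rows that mention an identifier missing from the mapping are skipped.
    """
    renamed = []
    for prot1, res1, prot2, res2 in rows:
        if prot1 not in id_to_name or prot2 not in id_to_name:
            continue  # crosslink involves a protein outside the model
        renamed.append(
            (id_to_name[prot1], int(res1), id_to_name[prot2], int(res2))
        )
    return renamed


# Hypothetical identifiers and residue numbers, for illustration only.
example_csv = "ID_A,10,ID_B,25\nID_A,40,ID_X,7\n"
mapping = {"ID_A": "WDR76", "ID_B": "SPIN1"}

rows = list(csv.reader(io.StringIO(example_csv)))
links = rename_crosslinks(rows, mapping)
print(links)
```

The second row is dropped because `ID_X` has no entry in the mapping, which mirrors the idea of restricting the crosslink set to the proteins being modeled.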

Sampling

To run the sampling, run the modeling script as follows:

for runid in `seq 1 50` ; do mpirun -np 8 $IMP python scripts/modeling.py prod $runid ; done

where $IMP is the setup script corresponding to the IMP installation directory (omit for binary installation).
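
The same loop can be driven from Python, which may be convenient when the runs need to be logged or submitted to a scheduler. The sketch below only builds the 50 command lines and does not launch them; the function name `build_sampling_commands` and its parameters are hypothetical, while the command itself follows the shell loop above.

```python
import shlex


def build_sampling_commands(n_runs=50, n_procs=8, imp_setup=None):
    """Return one mpirun command (as an argument list) per run ID 1..n_runs.

    imp_setup: path to the IMP setup script, or None for a binary
    installation, in which case it is left out of the command.
    """
    commands = []
    for run_id in range(1, n_runs + 1):
        cmd = ["mpirun", "-np", str(n_procs)]
        if imp_setup:
            cmd.append(imp_setup)
        cmd += ["python", "scripts/modeling.py", "prod", str(run_id)]
        commands.append(cmd)
    return commands


commands = build_sampling_commands()
print(len(commands))
print(shlex.join(commands[0]))
```

Each argument list can be passed to `subprocess.run(cmd, check=True)` to execute the runs one after another, as the shell loop does.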

Analysis

1. Getting the good scoring models

Good scoring models were selected using pmi_analysis (please refer to the pmi_analysis tutorial for a more detailed explanation) along with our variable_filter_v1.py script. These scripts are run as described below:

First, run run_analysis_trajectories_w_skip2.py as follows:
$IMP run_analysis_trajectories_w_skip2.py modeling run_
where $IMP is the setup script corresponding to the IMP installation directory (omit for binary installation), modeling is the directory containing all the runs, and run_ is the prefix for the names of individual run directories.

Then run variable_filter_v1.py on the major cluster obtained from the previous step:
$IMP variable_filter_v1.py -c N -g MODEL_ANALYSIS_DIR
where N is the cluster number of the major cluster and MODEL_ANALYSIS_DIR is the location of the directory containing the selected_models*.csv.
This can also be run using the submit_variable_filter_v1.sh script from the scripts/analysis/pmi_analysis directory.
Please also refer to the comments in variable_filter_v1.py for more details.

The selected good scoring models were then extracted using run_extract_models.py as follows:
$IMP python run_extract_models.py modeling run_ CLUSTER_NUMBER
where modeling is the path to the directory containing all the individual runs and CLUSTER_NUMBER is the number of the major cluster to be extracted.
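
The actual selection criteria are described in the comments of variable_filter_v1.py. Purely as a generic illustration of score-based filtering, and not as a description of that script's algorithm, the sketch below keeps the models whose score is no worse than the mean plus a chosen multiple of the standard deviation (lower scores being better). The function name, the multiplier, and the example scores are all hypothetical.

```python
import statistics


def filter_by_score(scores, multiplier=0.0):
    """Return indices of models with score <= mean + multiplier * std.

    scores: list of total scores, one per model (lower is better).
    multiplier: how many standard deviations above the mean to allow.
    """
    mean = statistics.mean(scores)
    std = statistics.pstdev(scores)  # population standard deviation
    cutoff = mean + multiplier * std
    return [index for index, score in enumerate(scores) if score <= cutoff]


# Made-up scores for four models; the last one is a clear outlier.
scores = [10.0, 12.0, 14.0, 30.0]
selected = filter_by_score(scores, multiplier=0.0)
print(selected)
```

Lowering the multiplier makes the filter stricter and raising it admits more models, which is one simple way to tune the size of the selected set.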

2. Running the sampling exhaustiveness tests (Sampcon)

A separate directory named sampcon was created and a density.txt file was added to it. This file contains the details of the domains to be split for plotting the localization probability densities. Finally, sampling exhaustiveness tests were performed using imp-sampcon as follows:
$imp python $sampcon/pyext/src/exhaust.py -n wdr_spin -a -m cpu_omp -c 2 -d analysis/density.txt -gp -g 2.0 -sa ../model_analysis/A_gsm_clust1.txt -sb ../model_analysis/B_gsm_clust1.txt -ra ../model_analysis/A_gsm_clust1.rmf3 -rb ../model_analysis/B_gsm_clust1.rmf3
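
For orientation, the sketch below generates the text of a density.txt file. It assumes that imp-sampcon reads this file as a Python assignment to a variable named `density_custom_ranges`, with each density defined by a list of protein names or (start, end, protein) tuples; this should be checked against the imp-sampcon documentation. The domain names and residue ranges are placeholders and are not the ones used in this study.

```python
def format_density_ranges(domains):
    """Render the domain definitions as a density_custom_ranges assignment."""
    return "density_custom_ranges = " + repr(domains) + "\n"


# Placeholder definitions: one density for a residue range of a protein,
# one density for a whole protein. Names and ranges are hypothetical.
domains = {
    "WDR76_region1": [(1, 100, "WDR76")],
    "SPIN1": ["SPIN1"],
}

text = format_density_ranges(domains)

# Round-trip check: executing the text recovers the same dictionary.
namespace = {}
exec(text, namespace)
assert namespace["density_custom_ranges"] == domains

print(text)
```

The resulting text can be written to the file passed to the `-d` option of the command above.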

3. Analysing the major cluster

Crosslink violations were analyzed as follows:

for xlfile in data/xlinks/modeling_xlfile_sheetA.dat data/xlinks/modeling_xlfile_sheetD.dat; do python get_xlviol_val_set_v2.py sampcon_cluster0_models.rmf3 $xlfile 35.0 & done
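
The value 35.0 in the command above is the distance cutoff. A common convention in integrative modeling is to count a crosslink as violated only if the crosslinked residues are farther apart than the cutoff in every model of the cluster. The sketch below applies that convention to precomputed distances; whether get_xlviol_val_set_v2.py uses exactly this criterion is an assumption, and the function name and distances are hypothetical.

```python
def crosslink_violations(distances_per_link, threshold=35.0):
    """Return (violated crosslink ids, fraction violated).

    distances_per_link: dict mapping a crosslink id to a list of distances,
    one per model in the cluster.
    A crosslink is violated when its minimum distance over all models
    exceeds the threshold.
    """
    if not distances_per_link:
        return [], 0.0
    violated = [
        link for link, distances in distances_per_link.items()
        if min(distances) > threshold
    ]
    return violated, len(violated) / len(distances_per_link)


# Made-up distances for two crosslinks across three models.
distances = {
    "xl1": [20.0, 28.5, 40.0],  # satisfied: at least one model is within the cutoff
    "xl2": [36.0, 41.2, 38.9],  # violated: every model exceeds the cutoff
}
violated, fraction = crosslink_violations(distances)
print(violated, fraction)
```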

Average distance maps were plotted for the models from the major cluster using scripts/analysis/cosmic_and_distance-maps/submit_contact_maps_all_pairs_surface.py.
This script calls scripts/analysis/cosmic_and_distance-maps/contact_maps_all_pairs_surface.py. Run contact_maps_all_pairs_surface.py with --help for more details.
Distance versus model index plots for the models in the major cluster were generated as follows:
```
python scripts/analysis/plot_mdlike.py
```
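
To make the idea of an average distance map concrete, the sketch below averages per-model distance matrices element by element. It is a minimal illustration and not the implementation in contact_maps_all_pairs_surface.py; the function name and the two small matrices are hypothetical.

```python
def average_distance_map(maps):
    """Average a list of equally sized distance matrices element by element.

    maps: list of matrices (lists of rows), one matrix per model, where
    entry [i][j] is the distance between bead i and bead j in that model.
    """
    n_models = len(maps)
    n_rows = len(maps[0])
    n_cols = len(maps[0][0])
    average = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for j in range(n_cols):
            total = sum(model_map[i][j] for model_map in maps)
            average[i][j] = total / n_models
    return average


# Two made-up models with two beads each.
model_maps = [
    [[0.0, 10.0], [10.0, 0.0]],
    [[0.0, 20.0], [20.0, 0.0]],
]
avg_map = average_distance_map(model_maps)
print(avg_map)
```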

Results

For the simulations, the following files are in the results directory:

cluster_center_model.rmf3 : representative bead model of the major cluster
chimera_densities.py : to view the localization densities (.mrc files)
xlviol : Directory containing the logs for crosslink violations
dmaps : Directory containing the average distance maps computed for all protein pairs
prism_output : Directory containing the output from PrISM

Information

Author(s): Shreyas Arvindekar, Shruthi Viswanath
License: CC BY-SA 4.0. This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.
Last known good IMP version: Not tested
Testable: Yes
Parallelizeable: Yes
Publications: Liu X, Zhang Y, Wen Z, Hao Y, Banks CAS, Cesare J, Bhattacharya S, Arvindekar S, Lange JJ, Xie Y, Garcia BA, Slaughter BD, Unruh JR, Viswanath S, Florens L, Workman JL, Washburn MP. An integrated structural model of the DNA damage-responsive H3K4me3 binding WDR76:SPIN1 complex with the nucleosome. Proc Natl Acad Sci U S A. 2024 Aug 13;121(33):e2318601121. https://doi.org/10.1073/pnas.2318601121
