Repository of the in-house code and supporting files used in the paper "Stress-induced RNA-chromatin interactions promote endothelial dysfunction".
For this part, the following software is needed:
These scripts were used for iMARGI raw data processing:
- iMARGI_data_processing.sh is the main script for raw data processing, from the FASTQ files to the BEDPE file of uniquely mapped read pairs.
- split_iMARGI_reads_chromosome_by_chromosome.sh splits the BEDPE file into 576 single BEDPE files, one per chromosome pair.
- Make_WholeGenome_Matrix.sh generates the contact matrices from the single BEDPE files, one per chromosome pair, at a specified resolution.
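The splitting and binning steps described above can be sketched in a few lines of Python. This is only an illustration, not the code in the repository scripts. It assumes the standard six leading BEDPE columns (chrom1, start1, end1, chrom2, start2, end2), 22 autosomes plus chrX and chrY (24 x 24 = 576 ordered pairs), and that the first mate indexes the rows and the second mate the columns. The function names and the example records are hypothetical.

```python
from itertools import product

# Assumption: 22 autosomes plus X and Y, giving 24 * 24 = 576 ordered pairs.
CHROMOSOMES = [f"chr{i}" for i in range(1, 23)] + ["chrX", "chrY"]


def split_by_chromosome_pair(bedpe_lines):
    """Group BEDPE records by (chrom1, chrom2), keeping the two start coordinates."""
    pairs = {pair: [] for pair in product(CHROMOSOMES, repeat=2)}
    for line in bedpe_lines:
        if line.startswith("#"):
            continue  # skip comment or header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 6:
            continue  # not a complete BEDPE record
        key = (fields[0], fields[3])
        if key in pairs:  # ignore contigs outside the 24 chromosomes
            pairs[key].append((int(fields[1]), int(fields[4])))
    return pairs


def contact_matrix(read_pairs, resolution, n_rows, n_cols):
    """Count read pairs per (row bin, column bin) at the given resolution in bp."""
    matrix = [[0] * n_cols for _ in range(n_rows)]
    for start1, start2 in read_pairs:
        row, col = start1 // resolution, start2 // resolution
        if row < n_rows and col < n_cols:
            matrix[row][col] += 1
    return matrix


# Made-up records, for illustration only.
example = [
    "chr1\t100\t150\tchr2\t250000\t250050",
    "chr1\t210000\t210050\tchr2\t10\t60",
    "chr1\t0\t50\tchr1\t5\t55",
]
pairs = split_by_chromosome_pair(example)
chr1_chr2 = contact_matrix(pairs[("chr1", "chr2")], 200000, 2, 2)
```

Keeping the pairs ordered, so that (chr1, chr2) and (chr2, chr1) are distinct, matches the count of 576 files, since the two ends of an iMARGI read pair are not interchangeable.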
These scripts were used for iMARGI data analysis and visualization:
- HUVEC_iMARGI.r is the main script for analyzing and parsing the data, generating data structures suitable for network plotting, and producing data summaries and reports.
- plot_network_function.r contains a custom in-house function, built on the R package igraph, for plotting super-enhancer networks and customizing properties such as node size, color, labels, and edge width. The function is called from the main script "HUVEC_iMARGI.r".
- HUVEC_iMARGI_replicate_2.r is used for analysis and parsing of the data from biological replicate 2.
- HUVEC_iMARGI_summary.r generates all the iMARGI summary plots in the paper.
- plot_iMARGI_maps.sh is a bash script that plots iMARGI contact matrices by running the Python script "plot_iMARGI.py".
- plot_iMARGI.py contains the functions to plot iMARGI contact matrices.
- The published software HiCtool (v2.2) was used for Hi-C data analysis and visualization, such as data pre-processing, data normalization, contact heatmap and correlation heatmap visualization, TAD and A/B compartment analyses.
- HUVEC_HiC.r calculates general Hi-C data statistics, the Measure of Concordance (MoC) of TAD boundaries between samples, curves of average interaction frequency by genomic distance, and the proportion of reads mapped within TADs.
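Two of the statistics listed above are simple enough to sketch. The Python below is an illustration only and is not taken from HUVEC_HiC.r. It assumes a square intrachromosomal contact matrix stored as a list of lists, TADs given as half-open (start, end) intervals on one chromosome, and a read pair counted as "within TADs" when both of its ends fall inside the same TAD. That last definition is an assumption, and the function names and numbers are hypothetical.

```python
def average_by_distance(matrix):
    """Mean contact count at each bin separation d (the d-th diagonal)."""
    n = len(matrix)
    return [
        sum(matrix[i][i + d] for i in range(n - d)) / (n - d)
        for d in range(n)
    ]


def fraction_within_tads(contacts, tads):
    """Fraction of contacts with both ends inside the same TAD.

    contacts: list of (pos1, pos2) coordinates on one chromosome.
    tads: list of half-open (start, end) intervals on that chromosome.
    """
    if not contacts:
        return 0.0
    inside = sum(
        1
        for pos1, pos2 in contacts
        if any(start <= pos1 < end and start <= pos2 < end for start, end in tads)
    )
    return inside / len(contacts)


# Made-up values, for illustration only.
toy_matrix = [[4, 2, 1], [2, 6, 3], [1, 3, 8]]
decay_curve = average_by_distance(toy_matrix)

toy_contacts = [(10, 20), (10, 150), (120, 180), (250, 260)]
toy_tads = [(0, 100), (100, 200)]
within = fraction_within_tads(toy_contacts, toy_tads)
```

Multiplying the index of each entry of the decay curve by the matrix resolution gives the genomic distance it corresponds to.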
RNA-seq analysis was mainly performed using the R package DESeq2 (v1.24.0). These scripts were used for RNA-seq analysis:
- HUVEC_RNAseq.sh is a bash script used for alignment (performed with STAR v2.5.4b) and to run featureCounts from the Subread package (v2.0.0), which produces the raw count data used as input to DESeq2.
- HUVEC_RNAseq.r contains the DESeq2 code that performs the RNA-seq data analysis and visualization.
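As a rough illustration of the hand-off between the two scripts, the sketch below reads a featureCounts table into a per-gene list of raw counts. It is not part of the repository. It assumes the default featureCounts layout (a leading comment line starting with "#", then a tab-separated header whose first six columns are Geneid, Chr, Start, End, Strand, and Length, followed by one count column per BAM file) and integer counts. The function name, gene names, and sample names are hypothetical.

```python
import csv
import io


def read_featurecounts(handle):
    """Return (sample_names, {gene_id: [raw counts]}) from a featureCounts table."""
    # Drop the leading comment line(s) that featureCounts writes.
    rows = (line for line in handle if not line.startswith("#"))
    reader = csv.reader(rows, delimiter="\t")
    header = next(reader)
    samples = header[6:]  # columns after Geneid, Chr, Start, End, Strand, Length
    counts = {row[0]: [int(value) for value in row[6:]] for row in reader if row}
    return samples, counts


# Made-up table, for illustration only.
example_table = (
    "# Program:featureCounts\n"
    "Geneid\tChr\tStart\tEnd\tStrand\tLength\tctrl.bam\ttreated.bam\n"
    "GENE_A\tchr1\t100\t900\t+\t801\t12\t30\n"
    "GENE_B\tchr2\t50\t450\t-\t401\t0\t7\n"
)
samples, counts = read_featurecounts(io.StringIO(example_table))
```

DESeq2 expects unnormalized integer counts, which is why the sketch keeps the raw values and applies no scaling.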
scRNA-seq analysis was mainly performed using the R package Seurat (v2.3.4). These scripts were used for single-cell RNA-seq analysis:
- cellranger.sh is a bash script used to run cellranger from 10X Genomics for both the in vitro and in vivo models.
- HUVEC_scRNAseq.r is the main R script for data analysis and visualization of the in vitro model (HUVEC cultured cells), based on functions from Seurat.
- HUVEC_scRNAseq_human_vascular.r is the main R script for data analysis and visualization of the in vivo model (human vascular cells), based on functions from Seurat.