recursiveCorPlot - natural clustering of RNA-seq data

- Introduction
- Installation
- Citing recursiveCorPlot
- Usage

Introduction

In classical hierarchical clustering of RNA-seq data, using Euclidean distance as the distance metric often results in unnatural clusters. For example, if the data contain genes that are strongly up-regulated in only a few samples because of hyper-amplifications, those samples weigh heavily on the Euclidean distance. The metric is therefore sensitive to outliers. Correlation-based clustering (distance = 1 - correlation(m)) is more common for RNA-seq data, and Spearman's rank correlation can be used to suppress outliers more aggressively.

We observed that some genes, relatively rich in zero counts, had somewhat lower correlations with all other genes, yet their correlations consistently pointed in the same direction as those of the other genes within a cluster. Because the direction of the correlations was consistent with the other genes but the data did not seem powerful enough, we took the correlation of the correlation as the distance metric: distance = 1 - correlation(correlation(m)). Hierarchical clustering on this distance with the "ward.D2" method showed neat, natural clusters.
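The distance described above can be sketched in a few lines of base R. This is an illustration only, not the package's own code: the matrix `m`, the simulated signals, the gene names, and the choice of Pearson correlation are all assumptions made for the example.

```r
# Minimal sketch of the recursive-correlation distance using base R only.
# `m` is a simulated toy matrix (genes in rows, samples in columns);
# all names and values here are hypothetical.
set.seed(42)

n_samples <- 30
signal_a <- rnorm(n_samples)  # hypothetical expression programme A
signal_b <- rnorm(n_samples)  # hypothetical expression programme B

# Five noisy genes per programme.
m <- rbind(
  t(replicate(5, signal_a + rnorm(n_samples, sd = 0.8))),
  t(replicate(5, signal_b + rnorm(n_samples, sd = 0.8)))
)
rownames(m) <- c(paste0("geneA", 1:5), paste0("geneB", 1:5))

# First-order correlation between genes.
# cor() correlates columns, so the matrix is transposed first.
# method = "spearman" is the more outlier-robust alternative.
cor_1 <- cor(t(m), method = "pearson")

# Second-order ("recursive") correlation: correlate the correlation profiles.
cor_2 <- cor(cor_1)

# distance = 1 - correlation(correlation(m)), clustered with Ward's method.
dist_recursive <- as.dist(1 - cor_2)
hc <- hclust(dist_recursive, method = "ward.D2")

# Cut the tree into two groups and inspect the assignment per gene.
clusters <- cutree(hc, k = 2)
print(clusters)
```

Replacing `cor_2` with `cor_1` in the `as.dist()` call gives the regular 1 - correlation distance, which makes it easy to compare the two dendrograms with `plot(hc)`.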

Installation

You can install recursiveCorPlot from GitHub using:

`devtools::install_github("yhoogstrate/recursiveCorPlot")`

Citing recursiveCorPlot

Please cite this paper when using recursiveCorPlot in your publications:

Youri Hoogstrate, Kaspar Draaisma, Santoesha A. Ghisai, Levi van Hijfte, Nastaran Barin, Iris de Heer, Wouter Coppieters, Thierry P.P. van den Bosch, Anne Bolleboom, Zhenyu Gao, Arnaud J.P.E. Vincent, Latifa Karim, Manon Deckers, Martin J.B. Taphoorn, Melissa Kerkhof, Astrid Weyerbrock, Marc Sanson, Ann Hoeben, Slávka Lukacova, Giuseppe Lombardi, Sieger Leenstra, Monique Hanse, Ruth E.M. Fleischeuer, Colin Watts, Nicos Angelopoulos, Thierry Gorlia, Vassilis Golfinopoulos, Vincent Bours, Martin J. van den Bent, Pierre A. Robe, Pim J. French,
Transcriptome analysis reveals tumor microenvironment changes in glioblastoma,
Cancer Cell,
2023,
ISSN 1535-6108,
https://doi.org/10.1016/j.ccell.2023.02.019

Usage

Example with G-SAM DE Genes:
`data('G.SAM.corrected.DE.genes.VST', package = 'recursiveCorPlot')`

[Figure: recursive correlation-based clustering]

[Figure: regular 1 - correlation-based clustering]

[Figure: scaled Euclidean distance-based clustering]
