Commit

[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
pre-commit-ci[bot] committed Aug 19, 2024
1 parent b9b0f3d commit 8b8cb53
Showing 5 changed files with 16 additions and 18 deletions.
1 change: 0 additions & 1 deletion .github/workflows/test.yaml.rej
@@ -13,4 +13,3 @@ diff a/.github/workflows/test.yaml b/.github/workflows/test.yaml (rejected hunks
+ python: "3.12"
pip-flags: "--pre"
name: PRE-RELEASE DEPENDENCIES

18 changes: 9 additions & 9 deletions README.md
@@ -39,15 +39,15 @@ CellCharter is able to automatically identify spatial domains, and offers a suit

## Features

- **Identify niches for multiple samples**: By combining the power of scVI and scArches, CellCharter can identify domains for multiple samples simultaneously, even in the presence of batch effects.
- **Scalability**: CellCharter can handle large datasets with millions of cells and thousands of features. The possibility to run it on GPUs makes it even faster.
- **Flexibility**: CellCharter can be used with different types of spatial -omics data, such as spatial transcriptomics, proteomics, epigenomics and multiomics data. The only difference is the method used for dimensionality reduction and batch effect removal.
    - Spatial transcriptomics: CellCharter has been tested on [scVI](https://docs.scvi-tools.org/en/stable/api/reference/scvi.model.SCVI.html#scvi.model.SCVI) with a zero-inflated negative binomial distribution.
    - Spatial proteomics: CellCharter has been tested on a version of [scArches](https://docs.scarches.org/en/latest/api/models.html#scarches.models.TRVAE) modified to use Mean Squared Error loss instead of the default Negative Binomial loss.
    - Spatial epigenomics: CellCharter has been tested on [scVI](https://docs.scvi-tools.org/en/stable/api/reference/scvi.model.SCVI.html#scvi.model.SCVI) with a Poisson distribution.
    - Spatial multiomics: it's possible to use multi-omics models such as [MultiVI](https://docs.scvi-tools.org/en/stable/api/reference/scvi.model.MULTIVI.html#scvi.model.MULTIVI), or to use the concatenation of the results from the different models.
- **Best candidates for number of domains**: CellCharter offers a [method to find multiple best candidates](https://cellcharter.readthedocs.io/en/latest/generated/cellcharter.tl.ClusterAutoK.html) for the number of domains, based on the stability of a certain number of domains across multiple runs.
- **Domain characterization**: CellCharter provides a set of tools to characterize and compare the spatial domains, such as domain proportion, cell type enrichment, (differential) neighborhood enrichment, and domain shape characterization.

Since CellCharter 0.3.0, we moved the implementation of the Gaussian Mixture Model (GMM) from [PyCave](https://github.com/borchero/pycave), which is no longer maintained, to [TorchGMM](https://github.com/CSOgroup/torchgmm), a fork of PyCave maintained by the CSOgroup. This change gives us a more stable, maintained implementation of GMM that is compatible with the most recent versions of PyTorch.
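
The stability criterion mentioned in the feature list can be illustrated with a small sketch. `ClusterAutoK` falls back to a function named `fowlkes_mallows_score` when no similarity function is given (see `_autok.py` in this commit). The pure-Python version below is a stand-in for such a score, not CellCharter's code; the helper names `pair_count` and `fowlkes_mallows` and the sample labelings are assumptions made for illustration.

```python
from collections import Counter
from math import sqrt


def pair_count(counts):
    """Number of ordered same-group pairs: sum of n * (n - 1) over group sizes."""
    return sum(n * (n - 1) for n in counts)


def fowlkes_mallows(labels_a, labels_b):
    """Fowlkes-Mallows index between two labelings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    # Pairs grouped together in both labelings, in A, and in B.
    together_both = pair_count(Counter(zip(labels_a, labels_b)).values())
    together_a = pair_count(Counter(labels_a).values())
    together_b = pair_count(Counter(labels_b).values())
    if together_both == 0:
        return 0.0
    return together_both / sqrt(together_a * together_b)


# Two hypothetical runs with the same K: identical partitions up to label names.
run_1 = [0, 0, 1, 1, 2, 2]
run_2 = [2, 2, 0, 0, 1, 1]
# A third hypothetical run that splits the cells differently.
run_3 = [0, 1, 0, 1, 0, 1]

print(fowlkes_mallows(run_1, run_2))
print(fowlkes_mallows(run_1, run_3))
```

Because the score compares pairs of cells rather than label values, relabeling the clusters between runs does not lower it, which is what makes it usable as a stability measure across repeated fits.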

1 change: 0 additions & 1 deletion docs/index.md.rej
@@ -6,4 +6,3 @@ diff a/docs/index.md b/docs/index.md (rejected hunks)
-template_usage.md
contributing.md
references.md

13 changes: 7 additions & 6 deletions src/cellcharter/tl/_autok.py
@@ -86,7 +86,7 @@ def __init__(
         self.similarity_function = similarity_function if similarity_function else fowlkes_mallows_score
         self.stability = []

-    def fit(self, adata: ad.AnnData, use_rep: str = 'X_cellcharter'):
+    def fit(self, adata: ad.AnnData, use_rep: str = "X_cellcharter"):
         """
         Cluster data multiple times for each number of clusters (K) in the selected range and compute the average stability for each of them.
@@ -97,19 +97,20 @@ def fit(self, adata: ad.AnnData, use_rep: str = 'X_cellcharter'):
         use_rep
             Key in :attr:`anndata.AnnData.obsm` to use as data to fit the clustering model. If ``None``, uses :attr:`anndata.AnnData.X`.
         """

         if use_rep not in adata.obsm:
             raise ValueError(f"{use_rep} not found in adata.obsm. If you want to use adata.X, set use_rep=None")

         X = adata.obsm[use_rep] if use_rep is not None else adata.X


         self.labels = defaultdict(list)
         self.best_models = {}

         random_state = self.model_params.pop("random_state", 0)

-        if ("trainer_params" not in self.model_params) or ("enable_progress_bar" not in self.model_params["trainer_params"]):
+        if ("trainer_params" not in self.model_params) or (
+            "enable_progress_bar" not in self.model_params["trainer_params"]
+        ):
             self.model_params["trainer_params"] = {"enable_progress_bar": False}

         previous_stability = None
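
Read literally, the guard in this hunk tests `use_rep not in adata.obsm` before the `use_rep is not None` fallback is reached, so passing `use_rep=None` appears to raise instead of falling back to `adata.X`, which is at odds with the docstring and the error message. A minimal sketch of a guard that honors `None` follows. It assumes a hypothetical `FakeAnnData` stand-in exposing only `obsm` and `X` (not the real `anndata.AnnData`) and a hypothetical helper `select_representation`; neither is part of CellCharter.

```python
class FakeAnnData:
    """Hypothetical stand-in exposing only the two attributes used here."""

    def __init__(self, X, obsm):
        self.X = X
        self.obsm = obsm


def select_representation(adata, use_rep="X_cellcharter"):
    """Return adata.X when use_rep is None, else the requested obsm entry."""
    # Handle None first so the membership test never sees it.
    if use_rep is None:
        return adata.X
    if use_rep not in adata.obsm:
        raise ValueError(f"{use_rep} not found in adata.obsm. If you want to use adata.X, set use_rep=None")
    return adata.obsm[use_rep]


adata = FakeAnnData(X=[[1.0, 2.0]], obsm={"X_cellcharter": [[0.5]]})
print(select_representation(adata))
print(select_representation(adata, use_rep=None))
```

Checking for `None` before the membership test keeps the error message accurate: it is only shown for a key that is actually missing.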
@@ -176,10 +177,10 @@ def best_k(self) -> int:
         stability_mean = np.array([np.mean(self.stability[k]) for k in range(len(self.n_clusters[1:-1]))])
         best_idx = np.argmax(stability_mean)
         return self.n_clusters[best_idx + 1]

     @property
     def peaks(self) -> List[int]:
-        """ Find the peaks in the stability curve. """
+        """Find the peaks in the stability curve."""
         if self.max_runs <= 1:
             raise ValueError("Cannot compute stability with max_runs <= 1")
         stability_mean = np.array([np.mean(self.stability[k]) for k in range(len(self.n_clusters[1:-1]))])
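
The `best_k` lines in this hunk average the per-run stability of each interior candidate and return the K with the highest mean. The first and last entries of `n_clusters` are skipped, which is why the index is shifted by one. A standalone sketch of that selection, using `statistics.mean` in place of NumPy; the function name `pick_best_k` and the sample numbers are made up for illustration.

```python
from statistics import mean


def pick_best_k(n_clusters, stability):
    """Return the interior K whose mean stability across runs is highest."""
    interior = n_clusters[1:-1]
    if len(stability) < len(interior):
        raise ValueError("need one list of stability scores per interior K")
    stability_mean = [mean(stability[i]) for i in range(len(interior))]
    # Like np.argmax, max(..., key=...) keeps the first index on ties.
    best_idx = max(range(len(stability_mean)), key=stability_mean.__getitem__)
    return interior[best_idx]


# Hypothetical candidates K = 2..6; scores exist only for the interior K = 3, 4, 5.
n_clusters = [2, 3, 4, 5, 6]
stability = [[0.80, 0.82], [0.95, 0.93], [0.70, 0.75]]
print(pick_best_k(n_clusters, stability))
```

Indexing into `n_clusters[1:-1]` directly is equivalent to the `best_idx + 1` offset into the full list used in the hunk.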
1 change: 0 additions & 1 deletion src/cellcharter/tl/_gmm.py
@@ -9,7 +9,6 @@
import scipy.sparse as sps
import torch
from pytorch_lightning import Trainer
from torchgmm import set_logging_level
from torchgmm.base.data import (
DataLoader,
TensorLike,
