Adding annotation of potential doublets to anndata obs #193

Merged 17 commits on Oct 22, 2024

@ptajvar ptajvar commented Oct 15, 2024

During the graph step, we split some components that appear to be technical multiplets through community detection. However, we set the parameters relatively conservatively to avoid breaking up single cells into multiple components. In the annotation step, we add a less conservative parametrization to mark "potential doublets".

What is added:

  • is_potential_doublet as a column in adata.obs.
  • n_edges_to_split_doublet as a column in adata.obs: how many edges (molecules) would have to be removed to split the two (or more) sub-communities detected in the potential doublet.
  • fraction_potential_doublets in the JSON report of the annotation step (the rate of is_potential_doublet).
  • n_edges_to_split_potential_doublets in the JSON report of the annotation step (the sum of n_edges_to_split_doublet).
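
To make the two per-component columns concrete, here is a minimal sketch in plain Python. It assumes a component's edge list and a community label per node are already available. The helper name `edges_to_split`, the toy graph, and the rule "more than one community means potential doublet" are all hypothetical; the PR description does not show which community-detection settings pixelator actually uses.

```python
from typing import Hashable, Iterable, Mapping, Tuple

Edge = Tuple[Hashable, Hashable]


def edges_to_split(edges: Iterable[Edge], community: Mapping[Hashable, int]) -> int:
    """Count edges whose endpoints fall in different communities.

    Removing exactly these edges disconnects the communities from each other.
    """
    return sum(1 for u, v in edges if community[u] != community[v])


# Toy component: two triangles joined by a single bridging edge (c-d).
edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
# Hypothetical labels, as a community-detection step might assign them.
community = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}

n_edges_to_split_doublet = edges_to_split(edges, community)
# Assumption of this sketch: more than one community marks a potential doublet.
is_potential_doublet = len(set(community.values())) > 1
```

In this toy case only the bridging edge crosses the two communities, so a low edge count relative to component size suggests two loosely attached cells rather than one cell.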

Fixes: EXE-2025

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

The unit tests.

PR checklist:

  • This comment contains a description of changes (with reason).
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • I have checked my code and documentation and corrected any misspellings
  • I have documented any significant changes to the code in CHANGELOG.md

@ptajvar ptajvar force-pushed the feature/exe-2025-annotate-potential-doublets branch from e79fbf4 to e08516d Compare October 21, 2024 10:33
@ptajvar ptajvar marked this pull request as ready for review October 21, 2024 15:25

@ambarrio ambarrio left a comment

LGTM. I left some comments to take into account, but I'm approving now so the merge is not delayed once they are answered or addressed.

src/pixelator/annotate/__init__.py
@@ -433,4 +437,12 @@ def anndata_metrics(adata: AnnData) -> AnnotateAnndataStatistics:
    if "doublet_size_threshold" in adata.uns:
        metrics["doublet_size_threshold"] = adata.uns["doublet_size_threshold"]

    if "is_potential_doublet" in adata.obs:
        metrics["fraction_potential_doublets"] = adata.obs[
Contributor

If there are no is_potential_doublet values here, what will happen when mean() is run?

Contributor Author

There would be a KeyError; that's why we check for the column first.

Contributor

But what happens when adata.obs["is_potential_doublet"] is empty and we call mean()?

Contributor Author

We only run mean() if "is_potential_doublet" exists as a column in adata.obs. Otherwise, that part is skipped and "fraction_potential_doublets" keeps its default value, which is 0 right now; following the discussion, we're going to change it to None.
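
For reference, the guard and the None defaults discussed in this thread could look roughly like the sketch below. It uses a plain pandas DataFrame as a stand-in for adata.obs; the function name `doublet_metrics`, the `DoubletMetrics` type, and the extra `len(obs) > 0` check are assumptions of this sketch and are not taken from the PR.

```python
from typing import Optional, TypedDict

import pandas as pd


class DoubletMetrics(TypedDict):
    fraction_potential_doublets: Optional[float]
    n_edges_to_split_potential_doublets: Optional[int]


def doublet_metrics(obs: pd.DataFrame) -> DoubletMetrics:
    # Defaults are None so "not computed" is distinguishable from a real 0.
    metrics: DoubletMetrics = {
        "fraction_potential_doublets": None,
        "n_edges_to_split_potential_doublets": None,
    }
    # `in` on a DataFrame tests column names. The row check is an addition
    # of this sketch: mean() of an empty column is NaN rather than an error.
    if "is_potential_doublet" in obs and len(obs) > 0:
        metrics["fraction_potential_doublets"] = float(
            obs["is_potential_doublet"].mean()
        )
    if "n_edges_to_split_doublet" in obs and len(obs) > 0:
        metrics["n_edges_to_split_potential_doublets"] = int(
            obs["n_edges_to_split_doublet"].sum()
        )
    return metrics


obs = pd.DataFrame(
    {
        "is_potential_doublet": [True, False, False, True],
        "n_edges_to_split_doublet": [3, 0, 0, 5],
    }
)
with_columns = doublet_metrics(obs)
without_columns = doublet_metrics(pd.DataFrame(index=["cell1"]))
empty_mean = pd.Series([], dtype=bool).mean()  # NaN, no exception raised
```

The last line addresses the reviewer's question directly: an empty column does not raise, but a NaN would end up in the JSON report unless it is handled, which is why the sketch leaves the value at None in that case.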

        metrics["fraction_potential_doublets"] = adata.obs[
            "is_potential_doublet"
        ].mean()
        metrics["n_edges_to_split_potential_doublets"] = adata.obs[
Contributor

Same question for sum().

@johandahlberg johandahlberg left a comment

I had some small suggestions regarding the type annotations. I think the code is clear and simple. Nice job!

@ptajvar ptajvar merged commit 7ca7168 into dev Oct 22, 2024
14 checks passed
@ptajvar ptajvar deleted the feature/exe-2025-annotate-potential-doublets branch October 22, 2024 12:10