Is there a way we can add in our own reference as training data #46
Comments
Hi @Sudheshna30, what do you mean exactly by using our own reference? Do you mean for generating the embedding or for clustering samples? For the first one, you can simply train your own scVI or trVAE model using the official tutorials of the packages. For fitting the clustering model on one dataset and then clustering a different dataset, you can use tl.Cluster.fit <https://cellcharter.readthedocs.io/en/latest/generated/cellcharter.tl.Cluster.html#cellcharter.tl.Cluster.fit> on the first dataset and then tl.Cluster.predict <https://cellcharter.readthedocs.io/en/latest/generated/cellcharter.tl.Cluster.html#cellcharter.tl.Cluster.predict> on the other one. I hope I understood the question, let me know if you meant something else :) |
Thank you for responding, Marco! I really appreciate it!
I'm interested in the second option: fitting the clustering model on a reference dataset and applying that knowledge to the actual dataset. We tried CellCharter on our pancreatic CosMx dataset and didn't see good clustering results, so I'm looking into whether we can train the model on a reference dataset and use that to cluster the original dataset.
Can you help me with an example of how to apply tl.Cluster.fit and tl.Cluster.predict <https://cellcharter.readthedocs.io/en/latest/generated/cellcharter.tl.Cluster.html#cellcharter.tl.Cluster.predict>?
Best
|
Hi @Sudheshna30. If I can ask, what was not good in your results for the pancreatic CosMx dataset? You are actually the second person who has told me that CellCharter didn't work so well on pancreatic tissue, so I am curious whether something specific in the tissue structure requires different parameters for CellCharter. Regarding fit and predict, you can look at the CosMx tutorial. There I used them on the same dataset, but nothing prevents you from processing the two datasets in the same way and using fit on the reference dataset and predict on your own dataset. So basically what you would do is:
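The step list that followed this sentence is not preserved in this copy of the thread. As a rough stand-in, the sketch below shows only the general pattern: fit a clustering model on the reference, then call predict on the query using the same representation. `ToyKMeans1D`, `cluster_with_reference`, and the dict-based datasets are hypothetical helpers invented for illustration; with CellCharter the model would presumably be a `cc.tl.Cluster` instance and the datasets AnnData objects, and the `use_rep` keyword is assumed to mirror its interface.

```python
from statistics import mean


class ToyKMeans1D:
    """Stand-in clustering model with fit/predict (NOT CellCharter's model).

    Datasets are plain dicts mapping a representation name to a list of
    1-D feature values (a hypothetical substitute for AnnData.obsm).
    """

    def __init__(self, n_clusters=2, n_iter=10):
        self.n_clusters = n_clusters
        self.n_iter = n_iter
        self.centroids = None

    @staticmethod
    def _nearest(value, centroids):
        # Index of the centroid closest to `value`.
        return min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))

    def fit(self, data, use_rep):
        values = sorted(data[use_rep])
        # Spread the initial centroids evenly across the sorted values.
        step = (len(values) - 1) / max(self.n_clusters - 1, 1)
        centroids = [values[round(i * step)] for i in range(self.n_clusters)]
        for _ in range(self.n_iter):
            groups = [[] for _ in centroids]
            for v in values:
                groups[self._nearest(v, centroids)].append(v)
            # Keep the old centroid if a cluster ends up empty.
            centroids = [mean(g) if g else c for g, c in zip(groups, centroids)]
        self.centroids = centroids
        return self

    def predict(self, data, use_rep):
        return [self._nearest(v, self.centroids) for v in data[use_rep]]


def cluster_with_reference(model, reference, query, use_rep="X_cellcharter"):
    """Fit `model` on the reference, then label the query with the fitted model."""
    model.fit(reference, use_rep=use_rep)
    return model.predict(query, use_rep=use_rep)


# Made-up data: both datasets expose the same representation key.
reference = {"X_cellcharter": [0.1, 0.2, 0.0, 5.1, 4.9, 5.0]}
query = {"X_cellcharter": [0.3, 4.8, 5.2, -0.1]}

labels = cluster_with_reference(ToyKMeans1D(n_clusters=2), reference, query)
print(labels)  # [0, 1, 1, 0]
```

The fitted model is only meaningful on the query if both datasets were processed into the same representation, which is the point of processing them in the same way.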
However, this assumes that there are no strong batch effects between the reference dataset and your dataset; otherwise the features from scVI trained on the reference dataset will not work well for your dataset. It may be a bit of work and will not necessarily help much unless the reference dataset is quite similar to yours. So, as I mentioned at the beginning, I suggest you share with me why you think the results are not good, so that we can figure out together how to improve them instead of using a reference dataset. |
After interacting privately, I want to clarify a common misconception about CellCharter that I keep seeing, even though it should be clear from reading the paper. CellCharter was not initially designed to find cell types but to find cell niches, which are areas with the same combination of cell types and cell states. You can identify cell types by running it with n_layers=0, and that can be convenient because it is very scalable, but this is not its original purpose. |
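To make the distinction concrete, here is a toy illustration (not CellCharter's implementation) of why `n_layers=0` amounts to clustering on each cell's own features, while larger values bring in the neighborhood. It assumes that aggregation means taking the per-layer mean of neighbor features and concatenating it to the cell's own features; `toy_aggregate_neighbors`, the three-cell graph, and the one-hot cell-type features are all made up for this example.

```python
from statistics import mean


def hop_layers(adjacency, start, n_layers):
    """Return [[start], cells at exactly 1 hop, ..., cells at exactly n_layers hops]."""
    visited = {start}
    frontier = [start]
    layers = [[start]]
    for _ in range(n_layers):
        next_frontier = []
        for cell in frontier:
            for neighbor in adjacency[cell]:
                if neighbor not in visited:
                    visited.add(neighbor)
                    next_frontier.append(neighbor)
        layers.append(next_frontier)
        frontier = next_frontier
    return layers


def toy_aggregate_neighbors(features, adjacency, n_layers):
    """Concatenate each cell's features with the mean features of each hop layer."""
    n_features = len(next(iter(features.values())))
    aggregated = {}
    for cell in features:
        vector = []
        for layer in hop_layers(adjacency, cell, n_layers):
            if layer:
                vector += [mean([features[c][j] for c in layer]) for j in range(n_features)]
            else:
                vector += [0.0] * n_features  # no cells at this distance
        aggregated[cell] = vector
    return aggregated


# Three cells in a chain a - b - c; features are one-hot cell types [A, B].
features = {"a": [1, 0], "b": [0, 1], "c": [0, 1]}
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}

cell_level = toy_aggregate_neighbors(features, adjacency, n_layers=0)
niche_level = toy_aggregate_neighbors(features, adjacency, n_layers=1)

# With n_layers=0, b and c are indistinguishable: same cell type.
print(cell_level["b"], cell_level["c"])    # [0, 1] [0, 1]
# With n_layers=1, they differ: b sits next to a type-A cell, c does not.
print(niche_level["b"], niche_level["c"])  # [0, 1, 0.5, 0.5] [0, 1, 0, 1]
```

Clustering the `n_layers=0` vectors can only separate cells by their own profile (cell types), whereas clustering the `n_layers=1` vectors can separate cells of the same type that sit in different surroundings (niches).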
Description of feature
Adding our own reference would be a great way to run this pipeline.