Hi. Thanks for making this tool, it's really useful!
I was wondering if you can give any recommendations for the smallest max_iter value that is still safe to use, depending on how many cells/samples a user has, while keeping computation efficient.
I have 810k cells in my MERFISH data with around 110 samples, and my k stability plot at max_iter=50 is wildly different than when I set max_iter=1000. However, at max_iter=1000, the autok.stability matrix has 90 columns, not 1000. Does that mean it took 90 iterations to reach convergence_tol?
Is there a way to estimate a reasonable number of iterations to use when running tol.ClusterAutoK to maximize stability/convergence while balancing iterations? I suspect it depends on how many cells/samples are being run, but some guidance on how to select the optimal number of iterations would be greatly appreciated.
Thank you!!
The convergence of the stability depends a lot on the dataset.
For this reason, I implemented the convergence_tol parameter. It compares the stability values between adjacent iterations, and if the curve changes by less than a given relative amount (computed using the Mean Absolute Percentage Error), it stops.
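To make the stopping rule concrete, here is a minimal sketch of that kind of check. It is not the library's actual code: the names `mape` and `run_until_converged` are hypothetical, and it assumes each iteration yields one stability curve (one value per candidate k).

```python
import numpy as np


def mape(previous, current):
    """Mean Absolute Percentage Error of `current` relative to `previous`."""
    previous = np.asarray(previous, dtype=float)
    current = np.asarray(current, dtype=float)
    return float(np.mean(np.abs((current - previous) / previous)))


def run_until_converged(curves, convergence_tol=1e-2, max_iter=1000):
    """Consume stability curves one iteration at a time (hypothetical helper).

    Returns (iterations_used, converged). Stops early as soon as two
    adjacent curves differ by less than `convergence_tol`.
    """
    previous = None
    n_iter = 0
    for n_iter, curve in enumerate(curves, start=1):
        if previous is not None and mape(previous, curve) < convergence_tol:
            return n_iter, True  # curve stopped changing: early stop
        if n_iter >= max_iter:
            return n_iter, False  # iteration budget exhausted first
        previous = curve
    return n_iter, False  # ran out of curves without converging


# Toy curves: a large jump, then an almost identical curve.
toy_curves = [[0.5, 0.6], [0.7, 0.8], [0.701, 0.801]]
print(run_until_converged(toy_curves, convergence_tol=1e-2))
```

With this rule, max_iter only acts as an upper bound; the number of iterations actually run is decided by the tolerance.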
If you end up with 90 columns, it's very likely that convergence was reached after 90 iterations. In theory, you should also see a warning when convergence is reached.
The reason why iter=50 is very different from iter=90 is exactly why the method doesn't stop earlier :) I would expect iter=89 and iter=90 to be quite similar, though.
If they are not, you should probably decrease convergence_tol (for example, from 1e-2 to 1e-3).
In general, I would suggest relying mainly on convergence_tol rather than max_iter, as it's much more dataset-independent. The number of iterations needed varies a lot between datasets: in all of my tests, I reached convergence within 5-10 iterations, which is quite different from your 90.
With convergence_tol, by contrast, you know that if the process stopped automatically at 90 iterations, the curve was no longer changing much between consecutive iterations.
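If you want to see how a tighter tolerance would have behaved on a run you already have, a rough post hoc check could look like the sketch below. It rests on assumptions that may not match the library: the stability matrix is taken to have one row per candidate k and one column per iteration, and the "curve" at a given iteration is taken to be the running mean over the columns so far. `iterations_needed` is a hypothetical helper, not part of the package.

```python
import numpy as np


def iterations_needed(stability, tolerances=(1e-2, 1e-3)):
    """For each tolerance, find the first iteration at which the running-mean
    curve changes by less than that tolerance (MAPE), or None if it never does.

    Assumes `stability` has shape (n_k_values, n_iterations).
    """
    stability = np.asarray(stability, dtype=float)
    n_iter = stability.shape[1]
    # Column j holds the mean stability curve after j + 1 iterations.
    running = np.cumsum(stability, axis=1) / np.arange(1, n_iter + 1)
    # change[i] compares the curve after i + 1 iterations with the one after i + 2.
    change = np.mean(
        np.abs((running[:, 1:] - running[:, :-1]) / running[:, :-1]), axis=0
    )
    result = {}
    for tol in tolerances:
        below = np.flatnonzero(change < tol)
        result[tol] = int(below[0]) + 2 if below.size else None
    return result


# Toy matrix: 2 candidate k values, 4 iterations.
toy_stability = [[0.5, 0.7, 0.6, 0.6],
                 [0.4, 0.4, 0.4, 0.4]]
print(iterations_needed(toy_stability, tolerances=(0.5, 1e-2)))
```

A `None` for a given tolerance means the existing run never became that stable, so rerunning with that tolerance would need more iterations than you currently have.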