How to limit the memory usage of AMF #1454
Comments
Hey there. Just curious, how deep are your trees? How much memory are you consuming? Do you know?
I don't know how deep my trees are; it might be unbounded, because I couldn't find a parameter in the settings to control it. I used around 20,000 data points with 61 dimensions for online training. After that, I saved the model locally using joblib and noticed that the saved model is approximately 500 MB. These are my settings:
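As a quick way to track the footprint without writing to disk, the serialized size of the model can be used as a rough proxy for its memory usage. This is a minimal sketch using only the standard library; the dictionary stand-in is hypothetical and would be replaced by the actual River model object. Note that the in-memory footprint of a tree ensemble is usually larger than its pickle.

```python
import pickle

def pickled_size_mb(model) -> float:
    """Approximate a model's footprint by its pickled size in MB.

    Serialized size is only a proxy: the live in-memory size of a
    tree ensemble is typically larger than its pickle.
    """
    return len(pickle.dumps(model)) / (1024 ** 2)

# Illustration with a stand-in object; a real AMF model would be passed instead.
dummy_model = {"trees": [list(range(1000)) for _ in range(10)]}
size = pickled_size_mb(dummy_model)
print(f"{size:.3f} MB")
```

Calling this periodically during online training makes it easy to plot how fast the forest grows with the number of observed samples.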
Dear tree-hugger @smastelini, would you have some spare time to look into this? I think it's worthwhile to get a better understanding of how fast/deep Mondrian trees grow :)
Hey everyone, I will do my best. The vanilla Mondrian trees had a budget parameter if I recall correctly. I am not that familiar with Aggregated Mondrian Trees, but I'll do my homework.
Thank you for your quick response! I appreciate your willingness to look into it. If you discover any information about the budget parameter for Aggregated Mondrian Trees during your research, I would be interested to learn more. Looking forward to any updates you can provide. Thanks again!
Any updates on this? If memory usage is unbounded, using this model online in production could result in eventual memory collapse. |
No, not yet. I started reading the paper, but so far I haven't found any kind of "budget" parameter. Unfortunately my time is currently scarce, so I cannot delve deeply into this topic right now.
A small update. I finished skimming through the paper, and from what I gather, the theoretical robustness guarantees of the algorithm and its adaptive nature are the factors that should provide an automatic cap on memory usage. As far as I can tell, there is no direct control from the user's standpoint. The idea is that the algorithm would (eventually) adapt and converge while avoiding overfitting. This last aspect could be the main source of excessive memory usage, as far as decision tree structures are concerned. I want to get a more practical understanding of AMFs by delving into the original code and the River adaptation. This should help me form a more solid opinion from an application viewpoint.
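Until an explicit budget parameter exists, one pragmatic workaround for production use is to wrap the model and rebuild it whenever its serialized size exceeds a hard byte budget. This is a generic sketch, not part of River's API: the `MemoryCappedLearner` name, the `make_model`/`update` callables, and the list-based stand-in model are all hypothetical. Resetting discards learned state, so this trades accuracy for a hard memory bound; a real deployment might instead keep a buffer of recent samples to warm-start the fresh model.

```python
import pickle

class MemoryCappedLearner:
    """Hypothetical wrapper: rebuilds the wrapped model from a factory
    whenever its pickled size exceeds a byte budget."""

    def __init__(self, make_model, update, max_bytes):
        self._make_model = make_model
        self._update = update          # e.g. lambda m, x, y: m.learn_one(x, y) for a River model
        self._max_bytes = max_bytes
        self.model = make_model()
        self.resets = 0

    def learn_one(self, x, y):
        self._update(self.model, x, y)
        if len(pickle.dumps(self.model)) > self._max_bytes:
            # Budget exceeded: discard the grown model and start fresh.
            self.model = self._make_model()
            self.resets += 1

# Stand-in "model" that grows without bound, mimicking the reported behaviour;
# a real AMFClassifier would be plugged in via make_model/update instead.
capped = MemoryCappedLearner(
    make_model=list,
    update=lambda m, x, y: m.append((x, y)),
    max_bytes=2_000,
)
for i in range(500):
    capped.learn_one({"f": i}, i % 2)
print(capped.resets)
```

Checking the pickled size on every sample is expensive for a large forest; checking every N samples, or only when a monitoring hook fires, keeps the overhead negligible.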
Versions
0.19.0
python=3.8
ubuntu 18.04
Issue
AMF integrated with River is very useful, but the unlimited growth of memory size restricts practical application. Could you please tell me how to limit the memory size of AMF?