The CVI re-uses the optimiser at each prod call, and this behaviour can be unsuitable for some optimisers #303
It's a good point; is there an API to reset it? As a workaround for now, you can use a custom callback for your optimization procedure:

```julia
function use_adam_callback(λ, ∇)
    opt = Flux.Adam(0.007)
    return ReactiveMP.cvi_update!(opt, λ, ∇)
end

cvi = CVI(rng, 1, 1000, use_adam_callback, ForwardDiffGrad(), 10, Val(true), true)
```

You can generalize this pattern in a structure and do something like

```julia
cvi = CVI(rng, 1, 1000, ResetOptimizer(() -> Flux.Adam(0.007)), ForwardDiffGrad(), 10, Val(true), true)
```

where

```julia
struct ResetOptimizer{C}
    callback::C
end

function cvi_update!(opt::ResetOptimizer, λ::NaturalParameters, ∇::NaturalParameters)
    return cvi_update!(opt.callback(), λ, ∇)
end
```
Thanks! Ideally, I want to reset the optimizer not between iterations but between different `prod` calls.
I will look into it. I do not know yet, but if it were possible to have an `O` parameter that is not an optimizer but a factory of optimizers, that should fix the issue anyway.
We discussed this with @albertpod, and we think this issue is quite severe: the current constructor for CVI is error-prone. This behaviour must either be documented explicitly, or perhaps we should instead disallow creating the CVI constructor with an actual optimizer and use the factory pattern indeed. We can still change things, given that we don't have a lot of users depending on the current implementation; better to change it now than later. I like the factory pattern approach, but maybe we can simply call …
@Nimrais Have you looked into the Optimisers.jl package?
@bvdmitri Yes, I did. The new version of Flux's training code has been written as an independent package called Optimisers.jl. It includes functionality that can solve our problem, called …
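For reference, a minimal sketch of how that package separates the optimization rule from its state (assuming the public `Optimisers.setup`/`Optimisers.update` API; the parameter and gradient vectors here are illustrative):

```julia
using Optimisers

# The rule holds only hyperparameters and is immutable
rule = Optimisers.Adam(0.007)

λ = randn(4)  # illustrative parameter vector
∇ = randn(4)  # illustrative gradient

# `setup` creates fresh optimizer state, so each new optimization
# task can start from clean Adam moment estimates
state = Optimisers.setup(rule, λ)

# `update` returns a new state and updated parameters
state, λ = Optimisers.update(state, λ, ∇)
```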
I have taken some time to think, and in my opinion, there are two important points at which users may want to reinitialise the optimizer or adjust the training hyperparameters: between individual iterations within one optimization, and between different `prod` calls. So, ideally, as I see it, we need a structure in which we can implement two callbacks, one for each of these points.
I've added the …
This has been fixed a while ago.
Imagine you have an optimizer that keeps an internal state and adapts its updates with each iteration (e.g., Adam).
If you try to use it in several optimization tasks with shared state, the convergence of each subsequent task is not guaranteed.
So if CVI is used in several consecutive `prod` calls with such an optimizer, it can start to break at some point.
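To make the mechanism concrete, here is a minimal sketch using the legacy `Flux.Optimise` interface, where `Adam` keeps per-array moment estimates in an internal `IdDict`; the objective is illustrative:

```julia
using Flux

opt = Flux.Adam(0.007)
x   = [1.0, 2.0]

# Task 1: minimize sum(x .^ 2); Adam accumulates moment estimates for x
for _ in 1:1000
    Flux.Optimise.update!(opt, x, 2 .* x)  # gradient of sum(x .^ 2)
end

# Task 2 re-uses both the optimizer and the parameter vector:
# the first steps are scaled by moments left over from task 1,
# not by the geometry of the new problem
x .= [100.0, -50.0]
Flux.Optimise.update!(opt, x, 2 .* x)
```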
The following code starts to produce a bad approximation at some point:
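A minimal sketch of the failing pattern, assuming the CVI constructor shown earlier in the thread and that CVI's `prod` can be invoked directly on two distributions (the Gaussian arguments are illustrative):

```julia
using ReactiveMP, Flux, Random

rng = MersenneTwister(42)

# One shared Adam instance: its internal moments persist across every prod call
cvi = CVI(rng, 1, 1000, Flux.Adam(0.007), ForwardDiffGrad(), 10, Val(true), true)

for i in 1:100
    # Every call re-uses the same mutated optimizer state,
    # so the approximation quality can degrade over time
    q = prod(cvi, NormalMeanVariance(0.0, 1.0), NormalMeanVariance(1.0, 2.0))
end
```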
This code works fine:
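A corresponding sketch of the working variant, under the same assumptions as above: a fresh CVI (and hence a fresh Adam) is constructed for every call, so no state leaks between products.

```julia
for i in 1:100
    # A brand-new Adam for every product: no state shared between calls
    cvi = CVI(rng, 1, 1000, Flux.Adam(0.007), ForwardDiffGrad(), 10, Val(true), true)
    q = prod(cvi, NormalMeanVariance(0.0, 1.0), NormalMeanVariance(1.0, 2.0))
end
```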
Should we give the user the ability to reset the optimizer between iterations or between `prod` calls?