Add sophia-h optimizer #979
base: main
Conversation
fixes #968
Thank you very much @evanatyourservice! And sorry for the delay.
I left you some comments we can discuss.
Hi Vincent, thank you for the notes! They all make perfect sense to me, and I'll get to updating the code and answering them tomorrow.
@evanatyourservice please ping us whenever you're ready for another round of reviews :-)
@fabianp Will do! Sorry, I've been moving, but I'll try to get this going ASAP.
there's no rush, just wanted to make sure you were not waiting on us :-)
Hello @evanatyourservice,
sounds good!
```python
raise ValueError("obj_fn must be provided to hutchinson update function.")
del updates
key, subkey = jax.random.split(state.key)
random_signs = otu.tree_random_like(
```
As far as I can tell from the paper (https://arxiv.org/pdf/2305.14342, section 2.3), it computes the Hutchinson estimator using a normal distribution, while here we use a Rademacher distribution. The Rademacher distribution should have lower variance, but perhaps it's worth adding a comment noting that this deviates from what is specified in the paper?
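For reference, both choices give an unbiased estimate of the Hessian diagonal: E[v ⊙ (Hv)] = diag(H) whenever E[vvᵀ] = I, which holds for standard normal and Rademacher probes alike. A minimal self-contained sketch of the two variants (function and variable names here are illustrative, not part of this PR):

```python
import jax
import jax.numpy as jnp

def hutchinson_diag(loss, params, key, distribution="rademacher"):
    """Single-probe Hutchinson estimate of diag(Hessian(loss)) at params."""
    if distribution == "rademacher":
        # +/-1 probes: v_i**2 == 1 exactly, so the diagonal terms
        # contribute no variance; only off-diagonal cross terms do.
        v = jax.random.rademacher(key, params.shape).astype(params.dtype)
    else:
        # Standard normal probes, as specified in section 2.3 of the paper.
        v = jax.random.normal(key, params.shape, dtype=params.dtype)
    # Hessian-vector product via forward-over-reverse differentiation.
    hvp = jax.jvp(jax.grad(loss), (params,), (v,))[1]
    return v * hvp  # unbiased estimate of diag(H)

loss = lambda p: jnp.sum(p ** 4)      # Hessian is diagonal: 12 * p**2
params = jnp.array([1.0, -2.0, 3.0])
keys = jax.random.split(jax.random.PRNGKey(0), 1000)
est = jnp.mean(
    jax.vmap(lambda k: hutchinson_diag(loss, params, k))(keys), axis=0
)
```

Since this toy Hessian is exactly diagonal, Rademacher probes recover 12 * p**2 with essentially no variance; with normal probes the per-component variance of the diagonal term scales with the squared diagonal entries, which is why Rademacher is a common low-variance choice even though the paper specifies a normal distribution.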
Hello @evanatyourservice,
Ok, sounds good! Sorry, I should get to this tomorrow.
PR to add the Sophia optimizer. It's mostly based on levanter's implementation, with some changes and added features here and there.
One note is that I had to change the contrib common test file a couple of times: once to return the loss_fn from the parabola and rosenbrock functions (which could be useful later for other optimizers that need the loss function), and a second time to bypass the check that update arguments are arrays (the loss function is not). Please advise if these changes are not OK or if there is a more correct approach.
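For readers unfamiliar with the optimizer, the Sophia update rule from the paper (https://arxiv.org/abs/2305.14342) can be sketched as below. This is a minimal illustration, not the API added in this PR; the function name and hyperparameter defaults are illustrative, and in the full algorithm the Hessian-diagonal estimate is refreshed only every k steps rather than on every call:

```python
import jax.numpy as jnp

def sophia_update(m, h, grad, hess_diag, b1=0.96, b2=0.99, rho=0.04, eps=1e-12):
    """One Sophia-style step: returns updated EMAs and the parameter update."""
    m = b1 * m + (1 - b1) * grad        # EMA of gradients
    h = b2 * h + (1 - b2) * hess_diag   # EMA of Hutchinson Hessian-diagonal estimates
    # Element-wise preconditioned step, clipped to [-1, 1] so that tiny or
    # negative curvature estimates cannot blow up the update.
    update = jnp.clip(m / jnp.maximum(rho * h, eps), -1.0, 1.0)
    return m, h, update
```

The clipping is the key difference from a plain diagonal-Newton step: wherever the curvature estimate is unreliable, the update degenerates to a sign-momentum step of magnitude at most the learning rate.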