
[Feature Request] Optuna experiment stream processing #2435

Open
Zhylkaaa opened this issue Oct 24, 2022 · 3 comments · May be fixed by #2461
Labels
enhancement (Enhancement request) · launchers · optuna · plugin (Plugins related issues) · sweep

Comments


Zhylkaaa commented Oct 24, 2022

🚀 Feature Request

Delegate experiment scheduling to the launcher and add a feedback loop from the launcher to the sweeper/experiment generator, allowing stream processing with ProcessPoolExecutor and similar executors. The main goal is to increase resource utilization and avoid waiting for all processes in a batch to finish.

Motivation

Is your feature request related to a problem? Please describe.
My team uses Optuna for hyperparameter sweeps of ML models with different training times. If, for example, we use an 8-GPU server and train 8 models in parallel with e.g. the joblib launcher, we can waste up to 1/4 of the walltime waiting for all jobs in the batch to finish.

Pitch

Describe the solution you'd like
I would like to propose creating an experiment generator class (so it is customizable). The sole purpose of this class is to return new configurations and receive experiment results to update the study.
I have an implementation based on the generator's send method, and it works well with my own loky launcher (i.e. concurrent.futures.ProcessPoolExecutor), which I can also contribute.
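To illustrate the streaming idea (this is a hypothetical sketch, not the implementation from the linked PR): a launcher can keep its pool saturated by submitting a new configuration from a generator as soon as any running job finishes, feeding results back via `send()`. The names `experiment_generator` and `stream_launch` and the toy objective are invented for this example, and a `ThreadPoolExecutor` stands in for the process-based launcher so the snippet runs anywhere:

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait

def objective(x):
    # Stand-in for a training run of variable duration.
    return (x - 3) ** 2

def experiment_generator(candidates):
    # Yields (index, config); receives a dict {index: result} of
    # finished experiments via send(), or None if nothing finished yet.
    history = {}
    for idx, cfg in enumerate(candidates):
        finished = yield idx, cfg
        if finished:
            history.update(finished)  # a real sweeper would update the study here

def stream_launch(gen, max_workers=2):
    # Submits a new job as soon as a worker slot frees up instead of
    # waiting for the whole batch, and feeds results back to the generator.
    running, results = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        item = next(gen)
        while True:
            idx, cfg = item
            running[pool.submit(objective, cfg)] = idx
            finished = {}
            if len(running) >= max_workers:  # pool saturated: wait for one job
                done, _ = wait(running, return_when=FIRST_COMPLETED)
                for fut in done:
                    finished[running.pop(fut)] = fut.result()
                results.update(finished)
            try:
                item = gen.send(finished or None)
            except StopIteration:
                break
        for fut, idx in running.items():  # drain the remaining jobs
            results[idx] = fut.result()
    return results

results = stream_launch(experiment_generator(range(6)))
best = min(results, key=results.get)
```

With this shape, a slow job only occupies its own worker; the other slots keep cycling through new configurations instead of idling until the batch completes.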

Describe alternatives you've considered

It's the only way we can achieve high GPU utilization.
Alternatively, we could adapt the Optuna plugin to accept futures and manage them itself, but I think job launching and awaiting results is the launcher's role.
Are you willing to open a pull request? (See CONTRIBUTING)
Yes, I can open a pull request with my implementation if this feature is of interest.


@Zhylkaaa Zhylkaaa added the enhancement (Enhancement request) label Oct 24, 2022
Jasha10 (Collaborator) commented Oct 24, 2022

> add a feedback loop from the launcher to the sweeper/experiment generator, allowing stream processing with ProcessPoolExecutor and similar executors

What sort of feedback loop did you have in mind?

> I would like to propose creating an experiment generator class

How would this fit together with the sweeper and the launcher? What is the interface for communication between them?

> It's the only way we can achieve high GPU utilization.

Do you have a working prototype?

Zhylkaaa (Author) commented Oct 24, 2022

Hi @Jasha10
We were thinking of an object that serves as a proxy between the study and the launcher.
Specifically, we implemented this by extending Generator and using generator.send(). We expect the generator to yield a 3-tuple (index, trial, overrides) and to receive (via send) a 3-tuple (indexes, trials, results) for experiments that have already finished.
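A minimal sketch of that protocol (all names here are hypothetical; `ToyStudy` is a random-search stand-in for an `optuna.Study` with ask/tell semantics, since the real sweeper and launcher classes live in the linked PR):

```python
import random

class ToyStudy:
    # Hypothetical stand-in for optuna.Study: ask() proposes a parameter,
    # tell() records the observed objective value.
    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.params = {}   # trial number -> suggested x
        self.results = {}  # trial number -> objective value

    def ask(self):
        number = len(self.params)
        self.params[number] = self._rng.uniform(-10, 10)
        return number

    def tell(self, number, value):
        self.results[number] = value

def experiment_generator(study, n_trials):
    # The proposed protocol: yield (index, trial, overrides); receive via
    # send() a (indexes, trials, results) tuple for jobs that finished.
    for idx in range(n_trials):
        trial = study.ask()
        overrides = (f"x={study.params[trial]}",)  # Hydra-style overrides
        finished = yield idx, trial, overrides
        if finished is not None:
            _, trials, values = finished
            for t, v in zip(trials, values):
                study.tell(t, v)

# Drive the generator synchronously, one job at a time, standing in
# for the launcher's side of the feedback loop.
study = ToyStudy()
gen = experiment_generator(study, 5)
idx, trial, overrides = next(gen)
while True:
    value = (study.params[trial] - 2) ** 2  # pretend the launcher ran it
    try:
        idx, trial, overrides = gen.send(([idx], [trial], [value]))
    except StopIteration:
        break
```

A streaming launcher would do the same thing, except that it submits each yielded trial to a pool and calls send() with whichever batch of jobs happens to have completed, possibly empty.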

Yes, we have a POC implemented with the joblib and custom loky launchers and the Optuna sweeper, but if we come to an agreement we can adapt all sweepers and launchers to the new API.
I will open a PR and link it here.

Zhylkaaa (Author) commented

Hi @Jasha10, to state that this isn't stale: we need to pass internal reviews and will try to send it ASAP once we have clearance.

@Zhylkaaa Zhylkaaa linked a pull request Nov 10, 2022 that will close this issue
@Jasha10 Jasha10 linked a pull request Dec 5, 2022 that will close this issue