Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add batch_size = 'all' option? #159

Closed
john-veillette opened this issue Jan 3, 2021 · 9 comments
Closed

add batch_size = 'all' option? #159

john-veillette opened this issue Jan 3, 2021 · 9 comments

Comments

@john-veillette
Copy link

Hey devs, thanks for this package.

I see that **kwargs is being deprecated from .fit, which makes sense in the interest of making the API more consistent for grid search, etc. However, it still seems desirable to be able to specify batch_size based on the input dimensions in case users want to use vanilla rather than stochastic gradient descent. Maybe adding a batch_size = 'all' option in init would remove any need to specify batch_size in fit?

@adriangb
Copy link
Owner

adriangb commented Jan 3, 2021

Hi 👋, glad you're finding it useful!

Would all=X.shape[0]?

@john-veillette
Copy link
Author

Yep! To replace the option to specify model.fit(X, y, batch_size = X.shape[0]) when fit(X, y, **kwargs) is removed, one could specify batch_size = 'all' on initialization. Should only take a few lines of code to implement.

@adriangb
Copy link
Owner

adriangb commented Jan 3, 2021

Agreed. Will leave this issue open to track.

@john-veillette
Copy link
Author

Thanks!

@stsievert
Copy link
Collaborator

Thats a decent argument for keeping fit **kwargs. I think compatibility with Keras is important.

@adriangb
Copy link
Owner

adriangb commented Feb 15, 2021

Hey @john-veillette, I haven't implemented this yet because like @stsievert points out above, we may ended up keeping **kwargs.

That said, I do think there are some use cases where **kwargs doesn't cut it. The first that comes to mind is cross-val-score, where presumably the length of the dataset is changing but you have no control over the sizes / cannot pass `**kwargs.

For your use case, do you feel that you actually need batch_size="all" or could **kwargs work for you?

Edit: I did a sample implementation, for reference, in #194.

@adriangb
Copy link
Owner

In addition to the implementation in #194, this could also be implemented via #167. Reference implementation here: https://www.adriangb.com/scikeras/refs/pull/167/merge/notebooks/DataTransformers.html#7.-Dynamically-setting-batch_size

@john-veillette
Copy link
Author

john-veillette commented Feb 16, 2021

For my use case, **kwargs works just as well! That's how I currently have it implemented, anyhow.

@adriangb
Copy link
Owner

adriangb commented Mar 6, 2021

Implemented in #194

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants