[Feature] Automate DataLoader as a default option #168

Open
GaoxiangLuo opened this issue Jul 6, 2022 · 0 comments
Labels
enhancement New feature or request


GaoxiangLuo commented Jul 6, 2022

Is your feature request related to a problem? Please describe.
The dataset configuration file has a dataFormat tag that is currently unused. It would be convenient to use this tag to automate basic data loading by default once the data has been downloaded from the URL, while still letting users supply their own customized data loading. This would be useful at production scale, where datasets that do not need fine-grained, dataset-specific pre-processing can rely on the default option.

Describe the solution you'd like
If a dataset is in npy format, the loader reads it as a single NumPy array. If it is in npz format, the loader reads it as a dictionary with the array names (headers) as keys and the arrays as values. If it is in csv or Stata format, the loader reads it as a pandas DataFrame. If it is a zip archive, the loader unzips it. If it is a Python pickle file, the loader loads its content. If a dataset consists of images for a classification task, the loader constructs the dataset using the sub-folder names as labels, which requires users to put each image into the sub-folder for its class.
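The dispatch described above could look roughly like the sketch below. It is only an illustration: the function names `load_default` and `index_image_folder`, the accepted dataFormat strings, the extraction directory name, and the list of image suffixes are all assumptions, not part of the existing code base. NumPy and pandas are imported lazily so that each branch only needs the library it actually uses.

```python
import pickle
import zipfile
from pathlib import Path

# Assumed set of image extensions; adjust to the datasets actually in use.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".gif"}


def load_default(path, data_format):
    """Load a downloaded dataset according to its dataFormat tag (hypothetical helper)."""
    path = Path(path)
    fmt = data_format.lower()
    if fmt == "npy":
        import numpy as np
        return np.load(path)  # a single array
    if fmt == "npz":
        import numpy as np
        with np.load(path) as archive:
            # Array names become keys, arrays become values.
            return {name: archive[name] for name in archive.files}
    if fmt == "csv":
        import pandas as pd
        return pd.read_csv(path)
    if fmt == "stata":
        import pandas as pd
        return pd.read_stata(path)
    if fmt == "zip":
        # Extract next to the archive and return the extraction directory.
        target = path.parent / f"{path.stem}_unzipped"
        with zipfile.ZipFile(path) as zf:
            zf.extractall(target)
        return target
    if fmt == "pickle":
        # Unpickling can execute arbitrary code; only use on trusted files.
        with open(path, "rb") as f:
            return pickle.load(f)
    raise ValueError(f"Unsupported dataFormat: {data_format!r}")


def index_image_folder(root):
    """Return (image_path, label) pairs, using sub-folder names as labels."""
    samples = []
    class_dirs = sorted(p for p in Path(root).iterdir() if p.is_dir())
    for class_dir in class_dirs:
        for image_path in sorted(class_dir.iterdir()):
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                samples.append((image_path, class_dir.name))
    return samples
```

For an image classification dataset shipped as a zip archive, the two helpers could be chained: `index_image_folder(load_default("images.zip", "zip"))`, assuming the archive contains one sub-folder per class at its top level.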

Describe alternatives you've considered
A data loader is an essential component of an ML task. In addition to the default option the system provides, users can bypass it and override it with their own customized data loaders.
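One possible way to let a customized loader take precedence over the default is a small registry keyed by dataset name. The sketch below is an assumption about how this might be wired up; `register_loader`, `resolve_loader`, and the dataset names are hypothetical and do not refer to existing project APIs.

```python
from typing import Any, Callable, Dict

# A loader takes the path of the downloaded data and returns the loaded object.
Loader = Callable[[str], Any]

# Hypothetical registry of user-supplied loaders, keyed by dataset name.
_CUSTOM_LOADERS: Dict[str, Loader] = {}


def register_loader(dataset_name: str, loader: Loader) -> None:
    """Register a customized loader that overrides the default for one dataset."""
    _CUSTOM_LOADERS[dataset_name] = loader


def resolve_loader(dataset_name: str, default_loader: Loader) -> Loader:
    """Return the customized loader if one is registered, else the default."""
    return _CUSTOM_LOADERS.get(dataset_name, default_loader)
```

With this design, datasets without a registered loader fall through to the default option, so users only write code for the datasets that need specific pre-processing.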

GaoxiangLuo added the enhancement label Jul 6, 2022