Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dataexamples tidy #1752

Open
wants to merge 11 commits into
base: master
Choose a base branch
from
Open

dataexamples tidy #1752

wants to merge 11 commits into from

Conversation

casperdcl
Copy link
Member

@casperdcl casperdcl commented Mar 13, 2024

Changes

OOP tidy of #1712

Testing you performed

Related issues/links

Checklist

  • I have performed a self-review of my code
  • I have added docstrings in line with the guidance in the developer guide
  • I have updated the relevant documentation
  • I have implemented unit tests that cover any new or modified functionality
  • CHANGELOG.md has been updated with any functionality change
  • Request review from all relevant developers
  • Change pull request label to 'Waiting for review'

Contribution Notes

Please read and adhere to the developer guide and local patterns and conventions.

  • The content of this Pull Request (the Contribution) is intentionally submitted for inclusion in CIL (the Work) under the terms and conditions of the Apache-2.0 License
  • I confirm that the contribution does not violate any intellectual property rights of third parties

@casperdcl casperdcl force-pushed the dataexamples-tidy branch 3 times, most recently from 976dd26 to bc3bef2 Compare March 13, 2024 16:54
'''
URL = 'https://zenodo.org/records/4912435/files/small.zip'
FILE_SIZE = '227 MB'
Copy link
Member Author

@casperdcl casperdcl Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hrobarts I think you're missing a @classmethod def get(...) here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious which file you would want to get from the sandstone dataset. I was thinking of just having the download function then the user can load whichever one they need. That's how I would use it in the how-to. But maybe that's a bad idea. What do you think @paskino ?

Copy link
Member Author

@casperdcl casperdcl Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could add an extra argument to the get function here to select the file?

Comment on lines 447 to 451
loader = ZEISSDataReader(file_name=filepath)
return loader.read()
except FileNotFoundError as exc:
raise ValueError(f"Specify a different data_dir or download data with `{cls.__name__}({data_dir}).download_data()`") from exc
Copy link
Member Author

@casperdcl casperdcl Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw @hrobarts why not this instead? It will prompt if not already downloaded...

Suggested change
try:
loader = ZEISSDataReader(file_name=filepath)
return loader.read()
except FileNotFoundError as exc:
raise ValueError(f"Specify a different data_dir or download data with `{cls.__name__}({data_dir}).download_data()`") from exc
self.download_data()
loader = ZEISSDataReader(file_name=filepath)
return loader.read()

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We had some discussion about wanting to keep the download separate from get so the user has to actively request the download. But I agree as we have the prompt before downloading this would be neater...

Copy link
Member Author

@casperdcl casperdcl Mar 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also could have def get(..., missing='prompt' / 'download' / 'error') to control (default) behaviour

hrobarts and others added 2 commits March 14, 2024 11:28
Co-authored-by: Casper da Costa-Luis <casper.dcl@physics.org>
Signed-off-by: Hannah Robarts <77114597+hrobarts@users.noreply.github.com>
return res == "y"

def download_data(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we sometimes want to just download the data without getting it e.g. in the how-to for the readers, the purpose is to show the loading step. In that case I think we need to keep data_dir as an argument and maybe keep this as a class method??

@casperdcl
Copy link
Member Author

FYI zenodo_get>=1.6.1 will be on conda-forge as soon as conda-forge/zenodo_get-feedstock#10 is merged. It's already on PyPI (pip).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Status: PRs to review
Development

Successfully merging this pull request may close these issues.

2 participants