Options for setting up DVC for data / pipelines #36

Merged: metazool merged 47 commits into main from dvc_dataset on Oct 3, 2024

metazool (Collaborator) commented Sep 26, 2024

This PR started out as a set of different approaches to using DVC to manage data and run simple, reproducible data pipelines, and became a grab-bag of small improvements.

It's too much of a mixed bag to expect anyone who hasn't invested time in this experimental project to review. Tests pass, and it's been a learning experience. There's more work to integrate on top (#38) that would be of direct short-term use to researchers, so I'm going to disable the review-required rule, merge this and move on.

* Uses import-url to do this
* Adds a bash script to automate it
* Documents the process
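
For readers unfamiliar with the command named in the list above, here is a minimal sketch of what wrapping `dvc import-url` in a small helper might look like. It is not the bash script added in this PR. The bucket URL, the output path and both function names are hypothetical, and the helper only prints the command unless `dry_run` is switched off.

```python
from __future__ import annotations

import shlex
import subprocess


def build_import_url_command(url: str, out: str) -> list[str]:
    """Return the argv for `dvc import-url <url> <out>`."""
    return ["dvc", "import-url", url, out]


def import_dataset(url: str, out: str, dry_run: bool = True) -> str:
    """Print the import command and return it as a shell-quoted string.

    With dry_run=False the command is also executed, which assumes that
    DVC is installed and that the current directory is a DVC repository.
    """
    cmd = build_import_url_command(url, out)
    printable = shlex.join(cmd)
    print(printable)
    if not dry_run:
        subprocess.run(cmd, check=True)
    return printable


if __name__ == "__main__":
    # Hypothetical source and destination; substitute real values.
    import_dataset("s3://example-bucket/raw/", "data/raw")
```

Run with `dry_run=False`, `dvc import-url` downloads the data and writes a `.dvc` file recording where it came from, so the import can be refreshed later with `dvc update`.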
metazool marked this pull request as draft September 27, 2024 15:30
metazool changed the title Work in progress - options for setting up DVC for data / pipelines Options for setting up DVC for data / pipelines Sep 27, 2024
metazool requested review from a team and removed request for albags October 3, 2024 14:12
metazool linked an issue Oct 3, 2024 that may be closed by this pull request
metazool marked this pull request as ready for review October 3, 2024 14:32
metazool merged commit 8f39f39 into main Oct 3, 2024
2 checks passed
metazool deleted the dvc_dataset branch October 3, 2024 14:32
Successfully merging this pull request may close these issues.

Confusion about the use of dask/xarray in this project