open_archive only top folder for zipfile #36

Open
ghost opened this issue May 21, 2020 · 0 comments

ghost commented May 21, 2020

  1. tar.gz and zip files quite often contain nested archives. tarfile extracts files recursively; zipfile doesn't and requires an extra step, which leaves the artifact contents incompletely extracted.
  2. In either case, the folder layout and final file locations aren't clear by default.
  3. We may want to log all the nested file contents as artifacts, particularly if they are tables. If they are layered images (image_blue, image_red, image_green), we may want to generate a metadata description summarizing these findings.
    (Idea: if 3. is the case and the files represent keys and data, i.e. normalized database tables, the extraction process might also recommend possible joins.)
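
A minimal sketch of how points 1 and 2 could be handled together, using only the standard-library zipfile and tarfile modules. The names `extract_nested` and `_unpack` and the `_extracted` folder suffix are hypothetical and are not part of the existing open_archive function. Each nested archive is unpacked explicitly, whichever format it is, and the function returns the final file locations so they can be logged. It does not guard against unsafe member paths, and it treats any zip-based file (for example .xlsx or .jar) as an archive.

```python
import tarfile
import zipfile
from pathlib import Path


def _unpack(archive: Path, dest: Path) -> bool:
    """Unpack one archive into dest; return False if it is not a zip or tar file."""
    if zipfile.is_zipfile(archive):
        dest.mkdir(parents=True, exist_ok=True)
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
        return True
    if tarfile.is_tarfile(archive):
        dest.mkdir(parents=True, exist_ok=True)
        with tarfile.open(archive) as tf:  # compression is detected automatically
            tf.extractall(dest)
        return True
    return False


def extract_nested(archive, target_dir):
    """Extract an archive plus any archives nested inside it.

    Returns the sorted list of final (non-archive) file paths.
    """
    target = Path(target_dir)
    if not _unpack(Path(archive), target):
        raise ValueError(f"not a zip or tar archive: {archive}")

    pending = [target]  # folders still to be scanned
    files = []
    while pending:
        folder = pending.pop()
        for entry in sorted(folder.iterdir()):
            if entry.is_dir():
                pending.append(entry)
                continue
            # Hypothetical convention: unpack next to the nested archive.
            nested_dest = entry.with_name(entry.name + "_extracted")
            if _unpack(entry, nested_dest):
                entry.unlink()  # drop the nested archive once unpacked
                pending.append(nested_dest)
            else:
                files.append(entry)
    return sorted(files)
```

The returned list could then feed the artifact logging described in point 3, for example by filtering on file suffix to pick out tables or image layers.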