-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make upload directory in S3 bucket configurable #254
Merged
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
- adds command line args to specify upload target directories/directory pattern - adds a command line arg to show variables that could be used in a directory pattern - makes the legacy path accessible (i.e., can be used in the directory pattern) - slighly modifies some variable names to make their purpose more clear - fixes small typo - makes script a bit more 'noisy' to let it tell what it does (should facilitate debugging; as some actions are more configurable it's desirable to have a bit more information about what is going on)
- read settings from config - parse settings - pass settings to upload script - adjust messages used for reporting via GitHub PR comments
…nto configurable_upload_directory
- added new settings `metadata_prefix` and `tarball_prefix` - added new settings for reporting about pull request download failures - added missing settings for build permissions and reporting about no permissions
bedroge
requested changes
Feb 26, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work @trz42 , looks good to me. Just have some tiny comments.
bedroge
approved these changes
Feb 26, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is the final PR to sync bot codes used by NESSI and EESSI. Essentially it provides two parts:
pr_comment_id
to simplify handling of issue comments #250, andbot: status
command)README.md
to include new and missing settingsThe PR has been extensively tested through multiple PRs with both a bot instance used for development and several production bot instances (NESSI). See
Changes to the upload location
Current practice in EESSI and NESSI
In EESSI, both the tarball and the metadata file are uploaded to the same unique path
In NESSI, we use a slightly modified directory structure. The tarball is placed at
The metadata file is placed at
The motivation for splitting the location for the tarball and the metadata file was to improve the efficiency of the ingestion procedure. In EESSI, the ingestion procedure scans for the whole S3 bucket for new tarballs and compares the found tarballs to Git branches in a staging repository on GitHub. The EESSI procedure also moved the metadata file in the staging repository to different top-level directories corresponding to the status of the ingestion. This procedure frequently consumed the available GitHub REST API requests (per hour). It was adjusted in NESSI such that most changes to the staging repository are done locally (in a Git repository on "disk") and only strictly necessary updates are pushed to the repository. Also, instead of moving the metadata file in the staging repository it was moved in the S3 bucket to keep track of the status of an ingestion.
Suggested changes (this PR) on the bot-side
To synchronise bot codes but accommodate for different ingestion procedures and needs in different scenarios, the initial idea for this PR was to make only a part of the upload location configurable. In particular,
tarballs
andnew
could be made configurable by settingstarball_dir
andmetadata_dir
, respectively. However, we saw an opportunity to make this even more flexible and thus also make a first step towards future efficiency improvements of the ingestion procedure. Issues that may benefit from this PR areSo, the suggested changes feature the following capabilities:
${github_repository}
which is expanded to the full name of the repository, e.g.,EESSI/software-layer
${pull_request_number}
which is expanded to the number of the pull request in which the tarball was created${legacy_aws_path}
which is expanded to the legacy path used in EESSI