Not every run created by the modeling pipeline(s) actually needs to be kept. Many runs are testing some sort of CI or pipeline infrastructure change and aren't serious candidates for model selection. These runs pollute the Athena model.* tables and incur S3 storage costs. We should delete them when possible.
This year, to make things easier, we should create a dedicated GitHub Actions workflow to delete erroneous model runs. The workflow should meet the following requirements:
Manual dispatch only with a deployment environment check
Takes a model run_id as an input
Tightly scoped IAM permissions to only allow recent (>= 2024) runs to be deleted
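A minimal sketch of what such a workflow might look like, assuming OIDC-based AWS authentication. The workflow name, the environment name `delete-model-runs`, the secret name `AWS_IAM_ROLE_TO_ASSUME_ARN`, the region, and the placeholder delete step are all hypothetical and not taken from the repository. The >= 2024 restriction is not shown here, since it would live in the policy attached to the assumed IAM role rather than in the workflow itself.

```yaml
# Hypothetical sketch: all names, secrets, and the delete step are assumptions.
name: delete-model-run

on:
  # Manual dispatch only; no push, pull_request, or schedule triggers
  workflow_dispatch:
    inputs:
      run_id:
        description: ID of the model run to delete
        required: true
        type: string

jobs:
  delete-model-run:
    runs-on: ubuntu-latest
    # Assumed environment name; its protection rules (e.g. required
    # reviewers) act as the deployment environment check
    environment: delete-model-runs
    permissions:
      id-token: write # Needed to request an OIDC token for AWS
      contents: read
    steps:
      - uses: actions/checkout@v4

      - uses: aws-actions/configure-aws-credentials@v4
        with:
          # Assumed secret holding the ARN of a role whose policy only
          # permits deleting objects belonging to recent (>= 2024) runs
          role-to-assume: ${{ secrets.AWS_IAM_ROLE_TO_ASSUME_ARN }}
          aws-region: us-east-1 # Assumed region

      - name: Delete model run
        env:
          # Pass the input through the environment rather than
          # interpolating it directly into the script
          RUN_ID: ${{ inputs.run_id }}
        run: echo "Placeholder, delete artifacts for run ${RUN_ID} here"
```

Passing `run_id` through `env` instead of inlining `${{ inputs.run_id }}` in the `run` script avoids shell injection from a malformed input.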
dfsnow changed the title from "[Infra updates] Create a manually dispatch GitHub Actions workflow to delete model runs" to "[Infra updates] Create a manually dispatched GitHub Actions workflow to delete model runs" on Nov 14, 2023.
Manually deleting all the artifacts of a model run from the relevant S3 buckets is kind of a pain, so in the past we used a helper function (https://github.com/ccao-data/model-res-avm/blob/master/R/helpers.R#L35-L64) to delete unneeded runs.