[Data catalog] Fast follow: Enable bucket versioning #131

Closed
jeancochrane opened this issue Sep 14, 2023 · 1 comment

@jeancochrane

When dbt recreates a model, it starts by dropping that model's Athena table or view, along with the associated partitions in the Athena metadata. By bringing our CTAS queries into the DAG (#111), we will therefore introduce an operation that destroys data, with the potential for downtime while data is being recreated.

As a first step to help address this data destruction, we should enable bucket versioning on the prod dbt data bucket so we can roll back to older data in case of a problem.

Blocked by #111.
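
For reference, enabling versioning could look roughly like the sketch below. It assumes the change is made with boto3 rather than through the console or infrastructure-as-code, which the issue does not specify. The bucket name and the helper names `versioning_request` and `enable_versioning` are hypothetical; only the `put_bucket_versioning` call and its `VersioningConfiguration` parameter come from the S3 API.

```python
# Hypothetical bucket name: the issue does not name the prod dbt data bucket.
BUCKET = "example-prod-dbt-data-bucket"


def versioning_request(bucket: str) -> dict:
    """Build the keyword arguments for S3's PutBucketVersioning operation."""
    return {
        "Bucket": bucket,
        "VersioningConfiguration": {"Status": "Enabled"},
    }


def enable_versioning(bucket: str) -> None:
    """Turn on versioning for `bucket` (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request builder works without it

    s3 = boto3.client("s3")
    s3.put_bucket_versioning(**versioning_request(bucket))


# Usage, once credentials for the right account are configured:
#   enable_versioning(BUCKET)
```

Once versioning is on, overwriting or deleting an object keeps the previous copy as a noncurrent version instead of discarding it, which is what makes a rollback possible.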

@jeancochrane jeancochrane added this to the Data catalog MVP milestone Sep 14, 2023
@jeancochrane jeancochrane self-assigned this Sep 14, 2023
@jeancochrane

Done! I also added a lifecycle rule to delete noncurrent object versions after 90 days, while always retaining the one most recent noncurrent version.
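
A minimal sketch of a lifecycle rule matching that description, again assuming boto3 (the comment does not say how the rule was actually created). The rule ID, the empty prefix filter, and the helper names are hypothetical; `NoncurrentDays` and `NewerNoncurrentVersions` are the S3 lifecycle fields that express "expire after 90 days, but keep the newest noncurrent version".

```python
def lifecycle_request(bucket: str, noncurrent_days: int = 90,
                      keep_versions: int = 1) -> dict:
    """Build the keyword arguments for S3's PutBucketLifecycleConfiguration."""
    return {
        "Bucket": bucket,
        "LifecycleConfiguration": {
            "Rules": [
                {
                    "ID": "expire-noncurrent-versions",  # hypothetical rule name
                    "Status": "Enabled",
                    # Assumption: the rule applies to every object in the bucket.
                    "Filter": {"Prefix": ""},
                    "NoncurrentVersionExpiration": {
                        # Delete a version once it has been noncurrent this long...
                        "NoncurrentDays": noncurrent_days,
                        # ...but always retain this many newest noncurrent versions.
                        "NewerNoncurrentVersions": keep_versions,
                    },
                }
            ]
        },
    }


def apply_lifecycle(bucket: str) -> None:
    """Apply the rule to `bucket` (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the request builder works without it

    s3 = boto3.client("s3")
    s3.put_bucket_lifecycle_configuration(**lifecycle_request(bucket))
```

Note that `put_bucket_lifecycle_configuration` replaces the bucket's whole lifecycle configuration, so any existing rules would need to be included in the same request.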
