Add Dockerfile and build-and-run-model workflow for CI model runs #9

@jeancochrane jeancochrane commented Nov 9, 2023

This PR adds a new build-and-run-model workflow to build, run, and clean up the model on GitHub Actions. The workflow is configured to run on PRs and workflow dispatch. On pushes to the main branch, the workflow will rebuild the container image for the main branch, but will not run or clean up the model.
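
The trigger behavior described above could be sketched roughly as follows. This is an illustrative fragment, not the workflow file added in this PR: the job names, the `docker build` step, and the placeholder run step are assumptions, and the real workflow delegates to a reusable workflow in ccao-data/actions rather than defining its steps inline.

```yaml
# Hypothetical sketch of the trigger logic only; job names and steps are
# illustrative and are not taken from this PR.
name: build-and-run-model

on:
  pull_request:
  workflow_dispatch:
  push:
    branches: [main]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build container image
        run: docker build -t model:${{ github.sha }} .

  run-and-cleanup:
    needs: build
    # On pushes to main, only the image is rebuilt; skip the model run.
    if: github.event_name != 'push'
    runs-on: ubuntu-latest
    steps:
      - name: Run model (placeholder)
        run: echo "submit the model run here, then clean up"
```

Gating the second job on `github.event_name` keeps a single workflow file for all three triggers while still avoiding a full model run on every push to the main branch.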

This PR mirrors the following PRs in model-res-avm:

Note also that in order to test a full model run, we switch to using the loc_cook_municipality_name feature instead of loc_tax_municipality_name. This is not actually the correct feature, and we'll need to switch it back in #4, but loc_tax_municipality_name is not yet present in the training data and properly refreshing the training data is outside the scope of this issue.

Closes #5.

Testing

This PR is pinned to the main branch of the ccao-data/actions repo, which will not contain the reusable workflow code that this branch needs until ccao-data/actions#1 gets merged. This means that the most recent build-and-run-model run will appear not to be successful. For an example of a successful model run, see this workflow which was run on the corresponding feature branch of ccao-data/actions containing the reusable workflow code that this repo needs.

@jeancochrane jeancochrane force-pushed the jeancochrane/5-infra-updates-copy-res-model-infra-updates-to-the-condo-model branch from 82518a4 to 299b40b Compare November 9, 2023 22:12
@jeancochrane jeancochrane force-pushed the jeancochrane/5-infra-updates-copy-res-model-infra-updates-to-the-condo-model branch from 3a182bd to 7fff710 Compare November 9, 2023 22:55
@jeancochrane jeancochrane force-pushed the jeancochrane/5-infra-updates-copy-res-model-infra-updates-to-the-condo-model branch from 8e5a417 to a5a64f5 Compare November 13, 2023 20:40
with:
  vcpu: "16.0"
  memory: "65536"
  role-duration-seconds: 14400 # Worst-case time for a full model run
Contributor Author

This is probably not correct, but I'm setting it to match the value for the res model for now until we improve model performance and get a better sense of our worst-case times.

@@ -47,7 +47,7 @@ stages:
          cache: false
      - output/workflow/recipe/model_workflow_recipe.rds:
          cache: false
    frozen: true
Contributor Author

I'm pretty sure this step was erroneously marked as frozen; we don't cache intermediate outputs, so freezing the train step leads to an error. Billy seemed to think this change was reasonable, but let me know if I'm off base!

Member

This is definitely an error. The only step that should really ever be frozen is the ingest step.
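
For readers less familiar with DVC, the distinction discussed in this thread might look something like the fragment below. It is a sketch under assumptions: the stage names, commands, and the input path are hypothetical, and only the `output/workflow/recipe/model_workflow_recipe.rds` path comes from the diff above. As far as I understand DVC, a stage marked `frozen: true` is not re-executed by `dvc repro`, which is a problem when its outputs are declared with `cache: false` and so cannot be restored from the cache.

```yaml
# Hypothetical dvc.yaml fragment; stage names, commands, and the input path
# are illustrative, not copied from this repo.
stages:
  ingest:
    cmd: Rscript pipeline/00-ingest.R
    outs:
      - input/training_data.parquet
    # Frozen: `dvc repro` leaves this stage alone and reuses its outputs.
    frozen: true

  train:
    cmd: Rscript pipeline/01-train.R
    deps:
      - input/training_data.parquet
    outs:
      - output/workflow/recipe/model_workflow_recipe.rds:
          cache: false
    # Not frozen: the uncached output must be regenerated on each run.
```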

@@ -9,6 +9,7 @@
suppressPackageStartupMessages({
  library(arrow)
  library(aws.s3)
  library(aws.ec2metadata)
Contributor Author

As we discovered in ccao-data/model-res-avm#26, this package is necessary in order to allow the aws.s3 package to authenticate using credentials in an ECS environment.

@@ -152,12 +152,12 @@ the 2023 assessment model.
| Percent Population Mobility, Moved From Within Same County in Past Year | acs5 | numeric | |
| Longitude | loc | numeric | |
| Latitude | loc | numeric | |
| Municipality Name | loc | character | |
Contributor Author

This is just an artifact of removing the var_name_model != "loc_tax_municipality_name" conditional in README.Rmd above. I'm not sure why it changes the order of the feature in the table, but it represents the reverse of what happened in #4.

@@ -106,11 +106,9 @@ ccao::vars_dict %>%
)
) %>%
mutate(`Unique to Condo Model` = ifelse(
var_name_model != "loc_tax_municipality_name" & (
Contributor Author

We no longer need this conditional now that both repos are (temporarily) using loc_cook_municipality_name.

@jeancochrane jeancochrane marked this pull request as ready for review November 14, 2023 18:29
@jeancochrane
Contributor Author

Oops, I forgot to pin to the main branch of the build-and-run-batch-job action before requesting review! Done in a852f22, so this should be ready for review.

Member

@dfsnow dfsnow left a comment

Looks good @jeancochrane. Let's get merges going for the actions repo + this, then do test runs of the res + condo pipelines just to make sure everything is working.


…h of ccao-data/actions""""

This reverts commit 3f2114e.
@jeancochrane jeancochrane merged commit 6cdc38f into master Nov 14, 2023
1 check passed
@jeancochrane jeancochrane deleted the jeancochrane/5-infra-updates-copy-res-model-infra-updates-to-the-condo-model branch November 14, 2023 22:44
Successfully merging this pull request may close these issues.

[Infra updates] Copy res model infra updates to the condo model