Edit DVC deps to include cmd run files #240

Merged
9 changes: 9 additions & 0 deletions dvc.yaml
@@ -8,6 +8,7 @@ stages:
- assessment
- input
outs:
- pipeline/00-ingest.R
- input/assessment_data.parquet
- input/char_data.parquet
- input/complex_id_data.parquet
@@ -23,6 +24,7 @@ stages:
Train a LightGBM model with cross-validation. Generate model objects,
data recipes, and predictions on the test set (most recent 10% of sales)
deps:
- pipeline/01-train.R
- input/training_data.parquet
params:
- cv
@@ -58,6 +60,7 @@ stages:
County. Also generate flags, calculate land values, and make any
post-modeling changes
deps:
- pipeline/02-assess.R
- input/training_data.parquet
- input/assessment_data.parquet
- input/complex_id_data.parquet
@@ -86,6 +89,7 @@ stages:
2. An assessor-specific ratio study comparing estimated assessments to
the previous year's sales
deps:
- pipeline/03-evaluate.R
- output/test_card/model_test_card.parquet
- output/assessment_pin/model_assessment_pin.parquet
params:
@@ -109,6 +113,7 @@ stages:
Generate SHAP values for each card and feature as well as feature
importance metrics for each feature
deps:
- pipeline/04-interpret.R
- input/assessment_data.parquet
- input/training_data.parquet
- output/assessment_card/model_assessment_card.parquet
@@ -134,6 +139,7 @@ stages:
Save run timings and run metadata to disk and render a performance report
using Quarto.
deps:
- pipeline/05-finalize.R
- output/intermediate/timing/model_timing_train.parquet
- output/intermediate/timing/model_timing_assess.parquet
- output/intermediate/timing/model_timing_evaluate.parquet
@@ -164,6 +170,7 @@ stages:
outputs prior to upload and attach a unique run ID. This step requires
access to the CCAO Data AWS account, and so is assumed to be internal-only
deps:
- pipeline/06-upload.R
- output/parameter_final/model_parameter_final.parquet
- output/parameter_range/model_parameter_range.parquet
- output/parameter_search/model_parameter_search.parquet
@@ -189,6 +196,8 @@ stages:
Generate Desk Review spreadsheets and iasWorld upload CSVs from a finished
run. NOT automatically run since it is typically only run once. Manually
run once a model is selected
deps:
- pipeline/07-export.R
params:
- assessment.year
- input.min_sale_year
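
For context, a minimal sketch of what one stage might look like after this change. The stage name `train` and the exact `cmd` string are assumptions, since neither appears in the hunks above; only the `deps` entries come from the diff. DVC reruns a stage when a listed dependency changes, so without the script in `deps`, editing `pipeline/01-train.R` alone would not mark the stage as changed.

```yaml
stages:
  train:                               # hypothetical stage name
    cmd: Rscript pipeline/01-train.R   # assumed command, not shown in the diff
    deps:
      - pipeline/01-train.R            # the script itself, added by this PR
      - input/training_data.parquet    # existing data dependency
```

With the script listed, `dvc status` should report the stage as changed after an edit to the script, and `dvc repro` should rerun it along with any downstream stages.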