Add dbt documentation to sale.* assets #202
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,19 +1,67 @@ | ||
# flag | ||
|
||
{% docs flag %} | ||
[Thought, non-blocking] I like the way you've been using prefixes to hint at the purpose of column descriptions, i.e.
I've already started to do that in the upcoming #203! So we're definitely on the same page.
||
This table holds the flag information from the sales val program. | ||
PIN-level sales validation flags created by | ||
[model-sales-val](https://github.com/ccao-data/model-sales-val). | ||
|
||
This is the primary sales validation output table. Flags within this table | ||
should be possible to reconstruct using the other sales validation tables: | ||
`sale.group_mean`, `sale.parameter`, and `sale.metadata`. | ||
|
||
**Primary Key**: `meta_sale_document_number`, `run_id`, `version` | ||
dfsnow marked this conversation as resolved.
|
||
{% enddocs %} | ||
|
||
# foreclosure | ||
|
||
{% docs foreclosure %} | ||
Foreclosure data ingested from Illinois Public Records (RIS). | ||
|
||
**Primary Key**: `pin`, `document_number` | ||
{% enddocs %} | ||
|
||
# parameter | ||
|
||
{% docs parameter %} | ||
This table holds information about the specifications used to flag outliers in the sales val program. | ||
Parameters used for each run of | ||
[model-sales-val](https://github.com/ccao-data/model-sales-val), | ||
including the statistical bounds, groupings, window sizes, etc. | ||
|
||
**Primary Key**: `run_id` | ||
{% enddocs %} | ||
|
||
# group_mean | ||
I've added markdown headers to the doc assets just to make them easier to read, navigate, and fold.
||
|
||
{% docs group_mean %} | ||
This table holds group mean information which we can utilize to explain exactly why an outlier was flagged. | ||
Information about groups used to calculate statistical deviations | ||
for sales validation. | ||
|
||
**Primary Key**: `run_id`, `group` | ||
{% enddocs %} | ||
|
||
# metadata | ||
|
||
{% docs metadata %} | ||
View to help the upload process of sales validation flags into iasWorld. | ||
Information about the code used for a sales validation run, as well as | ||
the start time and type of run. | ||
|
||
**Primary Key**: `run_id` | ||
{% enddocs %} | ||
|
||
# mydec | ||
|
||
{% docs mydec %} | ||
MyDec data from the Illinois Department of Revenue (IDOR). Includes property | ||
transfer declarations (sales) used to fill in missing data in `iasworld.sales` | ||
and as an input to sales validation flagging. | ||
|
||
**Primary Key**: `document_number`, `year_of_sale` | ||
{% enddocs %} | ||
|
||
# vw_ias_salesval_upload | ||
|
||
{% docs vw_ias_salesval_upload %} | ||
View to help the upload process of sales validation flags into iasWorld. | ||
{% enddocs %} | ||
View for sales validation outputs to create an upload format compatible | ||
with iasWorld. | ||
|
||
**Primary Key**: `salekey`, `run_id` | ||
{% enddocs %} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,17 +1,159 @@ | ||
sources: | ||
- name: sale | ||
tags: | ||
- load_auto | ||
tables: | ||
- name: flag | ||
description: '{{ doc("flag") }}' | ||
- name: parameter | ||
description: '{{ doc("parameter") }}' | ||
tags: | ||
- load_auto | ||
|
||
columns: | ||
- name: ptax_flag_original | ||
description: | | ||
Whether or not this sale was flagged on Q10 of the | ||
PTAX-203 form (regardless of statistical deviation) | ||
- name: meta_sale_document_number | ||
description: '{{ doc("shared_column_document_number") }}' | ||
- name: rolling_window | ||
description: | | ||
Rolling window period used to calculate statistics | ||
for flagging this sale | ||
- name: run_id | ||
description: '{{ doc("shared_column_sv_run_id") }}' | ||
- name: sv_is_heuristic_outlier | ||
description: '{{ doc("shared_column_sv_is_heuristic_outlier") }}' | ||
- name: sv_is_ptax_outlier | ||
description: '{{ doc("shared_column_sv_is_ptax_outlier") }}' | ||
- name: sv_is_outlier | ||
description: '{{ doc("shared_column_sv_is_outlier") }}' | ||
- name: sv_outlier_type | ||
description: '{{ doc("shared_column_sv_outlier_type") }}' | ||
- name: version | ||
description: '{{ doc("shared_column_sv_version") }}' | ||
|
||
- name: foreclosure | ||
description: '{{ doc("foreclosure") }}' | ||
tags: | ||
- load_manual | ||
Comment on lines +33 to +36
[Question, non-blocking] Do we want docs for
Probably, but there are a billion columns and I don't want to add docs for them right now. I added this as a subtask to #201.
||
|
||
- name: group_mean | ||
description: '{{ doc("group_mean") }}' | ||
tags: | ||
- load_auto | ||
|
||
columns: | ||
- name: group | ||
description: | | ||
Group string used as a unique identifier. | ||
|
||
Typically a combination of year, township, and class | ||
- name: group_size | ||
description: Number of properties in the group | ||
- name: run_id | ||
description: '{{ doc("shared_column_sv_run_id") }}' | ||
- name: mean_price | ||
description: Mean price of the group, in FMV | ||
- name: mean_price_per_sqft | ||
description: Mean price per sqft (of building) of the group, in FMV | ||
|
||
- name: parameter | ||
description: '{{ doc("parameter") }}' | ||
tags: | ||
- load_auto | ||
|
||
columns: | ||
- name: condo_stat_groups | ||
description: | | ||
Groups used to calculate flagging statistics (std. dev.) | ||
for condominium (class 299, 399) properties | ||
- name: dev_bounds | ||
description: | | ||
Boundaries for standard deviation flagging. | ||
|
||
Sales with prices beyond these boundaries are flagged. | ||
- name: earliest_data_ingest | ||
description: | | ||
Date of earliest sale used in validation. | ||
|
||
This is inclusive of the rolling window period used for | ||
calculating statistical groups. In other words, if the earliest | ||
sale to-be-flagged is 2013-12-01 and the rolling window period | ||
is 9 months, then the earliest sale *used* would be 2013-03-01 | ||
- name: iso_forest_cols | ||
description: Columns used as features in the isolation forest model | ||
- name: latest_data_ingest | ||
description: Date of latest sale used in validation | ||
- name: min_group_thresh | ||
description: | | ||
Minimum number of sales required for statistical flagging. | ||
|
||
If the minimum number of sales in our group methodology | ||
(township, class, rolling window) is below N, these sales | ||
are not flagged and are set to `Not outlier` | ||
- name: ptax_sd | ||
description: | | ||
Boundaries for standard deviation flagging in combination | ||
with a PTAX-203 flag | ||
- name: res_stat_groups | ||
description: | | ||
Groups used to calculate flagging statistics (std. dev.) | ||
for residential (class 2) properties | ||
- name: rolling_window | ||
description: | | ||
Rolling window size, in months. | ||
|
||
For each target sale, calculate statistics (std. dev., | ||
group size) using all sales in the period N months prior, | ||
inclusive of the month of the sale itself | ||
- name: run_id | ||
description: '{{ doc("shared_column_sv_run_id") }}' | ||
- name: sales_flagged | ||
description: | | ||
Total number of sales flagged. | ||
|
||
Inclusive of both sales flagged as outliers *and* sales | ||
flagged as non-outliers | ||
- name: short_term_owner_threshold | ||
description: | | ||
Properties with a significant price change and multiple | ||
sales within this time duration (in days) are flagged | ||
|
||
|
||
- name: metadata | ||
description: '{{ doc("metadata") }}' | ||
tags: | ||
- load_auto | ||
|
||
columns: | ||
- name: long_commit_sha | ||
description: Full commit SHA of the code used for the model run | ||
- name: run_id | ||
description: '{{ doc("shared_column_sv_run_id") }}' | ||
- name: run_timestamp | ||
description: Start timestamp of the model run | ||
- name: run_type | ||
description: | | ||
Type of model run. | ||
|
||
Variable can be one of `initial_flagging`, `recurring`, | ||
or `manual_update` | ||
- name: short_commit_sha | ||
description: Short commit SHA of the code used for the model run | ||
|
||
- name: mydec | ||
description: '{{ doc("mydec") }}' | ||
tags: | ||
- load_manual | ||
|
||
models: | ||
- name: sale.vw_ias_salesval_upload | ||
description: '{{ doc("vw_ias_salesval_upload") }}' | ||
|
||
columns: | ||
- name: run_id | ||
description: '{{ doc("shared_column_sv_run_id") }}' | ||
- name: salekey | ||
description: '{{ doc("shared_column_sale_key") }}' | ||
- name: sv_is_outlier | ||
description: '{{ doc("shared_column_sv_is_outlier") }}' | ||
- name: sv_outlier_type | ||
description: '{{ doc("shared_column_sv_outlier_type") }}' |
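The `earliest_data_ingest` description in the diff above works through a date example, and a small sketch may make the arithmetic easier to check. This is only an illustration: the helper name `subtract_months` is hypothetical and does not come from model-sales-val, and it assumes the rolling window is counted in whole calendar months, as in the column description's own example.

```python
from datetime import date


def subtract_months(d: date, months: int) -> date:
    """Return the first day of the month that falls `months` months before `d`.

    Hypothetical helper for illustration only; not part of model-sales-val.
    """
    # Count months since year 0, step back, then convert to (year, month).
    total = d.year * 12 + (d.month - 1) - months
    return date(total // 12, total % 12 + 1, 1)


# Values taken from the column description: the earliest sale to be
# flagged is 2013-12-01 and the rolling window is 9 months.
earliest_flagged = date(2013, 12, 1)
earliest_used = subtract_months(earliest_flagged, 9)
print(earliest_used)  # 2013-03-01
```

Under these assumptions the result matches the date given in the description, so `earliest_data_ingest` would be 2013-03-01 for that run.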
Just capitalizing MyDec correctly in all places in the documentation. Didn't want to put it in a separate PR.
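For readers of the `dev_bounds` and `min_group_thresh` descriptions in the diff, the rule they describe could be sketched roughly as follows. This is a sketch under stated assumptions, not the model-sales-val implementation: the function name `flag_group`, the default values, the use of raw prices, the reading of `dev_bounds` as a (lower, upper) pair of standard deviations, and the `High outlier` / `Low outlier` labels are all hypothetical. Only the `Not outlier` label appears in the column docs.

```python
from statistics import mean, stdev


def flag_group(prices, dev_bounds=(2.0, 2.0), min_group_thresh=30):
    """Label each sale price in one group (hypothetical illustration).

    Groups smaller than `min_group_thresh` are not flagged at all; every
    sale in them is set to "Not outlier", as the column docs describe.
    """
    # stdev() needs at least two values, so guard against tiny groups too.
    if len(prices) < max(min_group_thresh, 2):
        return ["Not outlier"] * len(prices)

    mu = mean(prices)
    sd = stdev(prices)
    if sd == 0:
        # All prices identical: nothing can deviate from the mean.
        return ["Not outlier"] * len(prices)

    lower, upper = dev_bounds  # assumed: std. devs below / above the mean
    labels = []
    for price in prices:
        z = (price - mu) / sd
        if z > upper:
            labels.append("High outlier")  # hypothetical label
        elif z < -lower:
            labels.append("Low outlier")  # hypothetical label
        else:
            labels.append("Not outlier")
    return labels


# Nine similar sales and one far above the group mean.
example = flag_group([100.0] * 9 + [1000.0], min_group_thresh=5)
```

In this example only the last sale falls outside the bounds. With the same prices and the default `min_group_thresh=30`, every sale would be set to `Not outlier` because the group is too small.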