Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate mydec sales into vw_pin_sale #588

Open
wants to merge 127 commits into
base: master
Choose a base branch
from

Conversation

wagnerlmichael
Copy link
Member

@wagnerlmichael wagnerlmichael commented Sep 9, 2024

This PR uses brings in mydec sales into default.vw_pin_sale only if the doc_no does not already exist in iasworld. We coalesce columns with priority given to iasworld, and add a new column called source which lets us know which table the sale was pulled from; iasworld or mydec.

If a mydec doc_no does not exist in iasworld, but does exist in mydec, we will bring that sale into this view. If the doc_no exists in both tables we will use the iasworld doc_no.

We (discussed with @jeancochrane ) also changed the error threshold for the this test since we are letting in a bunch of sales that were previously no longer in iasworld. It pushes the test failures from 3997 to 4600.

We will need to make a large amount of PRs in other repos to accommodate this. Most likely a bunch of where source ='iasworld' filters.

View at FROM z_ci_583_create_a_sales_view_that_combines_iasworld_mydec_and_ccrd_sales_default.vw_pin_sale

Quick integrity checks

Total counts

SELECT source, count(*)
FROM "z_ci_583_create_a_sales_view_that_combines_iasworld_mydec_and_ccrd_sales_default"."vw_pin_sale"
GROUP BY source;

1 | mydec | 205367
2 | iasworld | 2496073

. . .

select count(*) from default.vw_pin_sale

1 | 2496073

Filter deed type

-- deed_type_filter
SELECT count(*)
FROM "z_ci_583_create_a_sales_view_that_combines_iasworld_mydec_and_ccrd_sales_default"."vw_pin_sale"
where
source = 'iasworld'
and sale_filter_deed_type

1 | 116921

. . . 

select count(*) from default.vw_pin_sale
where sale_filter_deed_type

1 | 116921

Filter same sale within 365 days

This figure is different because after a talk with Billy it was decided that we should be
using doc_no's from mydec in the calculation of this filter even for ias doc_no observations.

select count(*), sale_filter_same_sale_within_365
from "z_ci_583_create_a_sales_view_that_combines_iasworld_mydec_and_ccrd_sales_default"."vw_pin_sale"
where source = 'iasworld'
group by sale_filter_same_sale_within_365

1	2480605	false
2	15468	true


. . .

select count(*), sale_filter_same_sale_within_365
from "z_ci_sale_filter_fix_default"."vw_pin_sale"
group by sale_filter_same_sale_within_365


1 | 13188 | true
2 | 2482885 | false

is_mydec_date check

select count(*), is_mydec_date
from "z_ci_583_create_a_sales_view_that_combines_iasworld_mydec_and_ccrd_sales_default"."vw_pin_sale"
where source = 'iasworld'
group by is_mydec_date

1	913170	true
2	1582903	false

. . .

select count(*), is_mydec_date
from "default"."vw_pin_sale"
group by is_mydec_date

1	913170	true
2	1582903	false

Sales by year counts

select count(*), year
from "z_ci_583_create_a_sales_view_that_combines_iasworld_mydec_and_ccrd_sales_default"."vw_pin_sale"
where source = 'iasworld'
group by year

5	75533	2023
21	101313	2022
9	106914	2021
28	113805	2020
16	141078	2019
25	127821	2018
27	127386	2017
31	127655	2016

. . .

select count(*), year
from default.vw_pin_sale
group by year

#	_col0	year
11	75533	2023
28	101313	2022
8	106914	2021
32	113805	2020
21	141078	2019
1	127821	2018
31	127386	2017
15	127655	2016

@wagnerlmichael wagnerlmichael marked this pull request as ready for review October 11, 2024 15:39
@wagnerlmichael wagnerlmichael requested a review from a team as a code owner October 11, 2024 15:39
dbt/models/default/docs.md Outdated Show resolved Hide resolved
dbt/models/default/default.vw_pin_sale.sql Outdated Show resolved Hide resolved
dbt/models/default/default.vw_pin_sale.sql Show resolved Hide resolved
dbt/models/default/default.vw_pin_sale.sql Show resolved Hide resolved
)
/* Some sales in mydec have multiple rows for one pin on a given sale date.
Sometimes they have different dates than iasworld prior to 2021 and when
joined back onto unique_sales will create duplicates by pin/sale date. */
WHERE num_single_day_sales = 1
OR (YEAR(mydec_date) > 2020)
OR (YEAR(sale_date) > 2020)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably just use the year column we create in the inner query, huh

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in this change makes sense?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in we define year a few lines above this, let's just use that

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's take care of this before we merge.

dbt/models/default/default.vw_pin_sale.sql Outdated Show resolved Hide resolved
dbt/models/default/default.vw_pin_sale.sql Outdated Show resolved Hide resolved
@wrridgeway
Copy link
Member

Just gonna nibble around the edges on this for the first bite. I'll look at the data and inspect the filters once this round of changes is made.

@wagnerlmichael
Copy link
Member Author

Just gonna nibble around the edges on this for the first bite. I'll look at the data and inspect the filters once this round of changes is made.

Impressively mouthy sentence

)
/* Some sales in mydec have multiple rows for one pin on a given sale date.
Sometimes they have different dates than iasworld prior to 2021 and when
joined back onto unique_sales will create duplicates by pin/sale date. */
WHERE num_single_day_sales = 1
OR (YEAR(mydec_date) > 2020)
OR (YEAR(sale_date) > 2020)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's take care of this before we merge.

dbt/models/default/default.vw_pin_sale.sql Show resolved Hide resolved
Copy link
Member

@wrridgeway wrridgeway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few tiny suggestions, we're really close.

dbt/models/default/default.vw_pin_sale.sql Outdated Show resolved Hide resolved
dbt/models/default/default.vw_pin_sale.sql Show resolved Hide resolved
dbt/models/default/default.vw_pin_sale.sql Outdated Show resolved Hide resolved
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Create a sales view that combines iasWorld, myDec, and CCRD sales
3 participants