This repository stores the code for the CCAO Data Department's ETL pipelines and data lakehouse. This infrastructure supports the Data Team's modeling, reporting, and data integrity work.
- 📁 dbt Data Catalog - Documentation for all CCAO data lakehouse tables and views
- 🔩 dbt README - How to develop CCAO data infrastructure using dbt
- 🧪 dbt Tests and QC Reports - How to add and run data tests, unit tests, and QC reports using dbt
- 📝 dbt Generic Test Documentation - Definitions for CCAO generic dbt tests, which are functions that we use to define our QC tests

The repository is organized into the following directories:

- ./dbt contains the models and tests that build our Athena data lakehouse; dbt mainly acts as a transformation and documentation layer on top of our raw data
- ./docs contains design documents and other supplemental documentation
- ./etl contains ETL scripts that load raw and lightly cleaned data into the lakehouse as dbt sources
- ./socrata contains column transformations for the CCAO's Open Data Portal assets