Codebase for CCAO data infrastructure construction and management

ccao-data/data-architecture

CCAO Data Infrastructure

This repository stores the code for the CCAO Data Department's ETL pipelines and data lakehouse. This infrastructure supports the Data Team's modeling, reporting, and data integrity work.

Quick Links

Repository Structure

  • ./dbt contains the models and tests that build our Athena data lakehouse; dbt mainly acts as a transformation and documentation layer on top of our raw data
  • ./docs contains design documents and other supplemental documentation
  • ./etl contains ETL scripts that load raw and lightly cleaned data into the lakehouse as dbt sources
  • ./socrata contains column transformations for the CCAO's Open Data Portal assets