This project builds a scalable data warehouse tech stack that helps provide an AI service to a client. The data used for this project is sensor data in CSV format; Data (ucdavis.edu) provides the same data in both Parquet and CSV. The ELT pipeline extracts data from the links on that website, stages it in a MySQL database, and transforms it using DBT. The whole process is automated with Airflow.
- Project: Build a Data Warehouse using MySQL, Airflow and DBT
- Table of Contents
- Project Structure
- ELT Pipeline
- License
```
airflow
|___dags
     |____create_station_Summary.sql   # database/table creation using MySQL
     |____insert_station_summary.sql   # SQL file for loading data
     |____load_data_airflow.py         # loads data
     |____dbt_airflow.py               # transforms data using DBT
DBT
|___models
     |____merged_station.sql           # SQL file for transforming tables
```
ELT pipeline builder
create_tables
- Creates the warehouse tables in MySQL and automates the step with Airflow (see the sketch below)
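A minimal sketch of what the table-creation DAG could look like, assuming the DDL in create_station_Summary.sql is run through Airflow's MySQL provider. The DAG id, schedule, and the mysql_default connection id are illustrative assumptions, not the project's actual configuration:

```python
# dags/create_station_summary_dag.py -- illustrative sketch, not the project's actual DAG
from datetime import datetime

from airflow import DAG
from airflow.providers.mysql.operators.mysql import MySqlOperator

with DAG(
    dag_id="create_station_summary",          # hypothetical DAG id
    start_date=datetime(2023, 1, 1),
    schedule_interval="@once",
    catchup=False,
) as dag:
    # Runs the DDL in create_station_Summary.sql against the MySQL database
    create_tables = MySqlOperator(
        task_id="create_station_summary_table",
        mysql_conn_id="mysql_default",        # assumed Airflow connection id
        sql="create_station_Summary.sql",
    )
```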
load_tables
- Loads the raw CSV data into DataFrames, writes them to the MySQL staging tables, and automates the step with Airflow (a sketch follows below)
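A hedged sketch of the load step, assuming the CSV is read with pandas and appended to a staging table through Airflow's MySQL hook. The dataset URL, table name, DAG id, and connection id below are placeholders for the project's real values:

```python
# dags/load_data_airflow_sketch.py -- illustrative sketch of the CSV-to-staging load step
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.mysql.hooks.mysql import MySqlHook

CSV_URL = "https://example.edu/sensor_data.csv"   # placeholder for the actual dataset link


def load_csv_to_staging():
    """Read the sensor CSV into a DataFrame and append it to the staging table."""
    df = pd.read_csv(CSV_URL)
    hook = MySqlHook(mysql_conn_id="mysql_default")        # assumed connection id
    engine = hook.get_sqlalchemy_engine()
    df.to_sql("station_summary_staging", con=engine, if_exists="append", index=False)


with DAG(
    dag_id="load_station_data",               # hypothetical DAG id
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    load_task = PythonOperator(
        task_id="load_csv_to_staging",
        python_callable=load_csv_to_staging,
    )
```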
transform_tables
- Transforms the staged tables using the DBT SQL models and automates the step with Airflow (see the sketch below)
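One common way to automate the transformation step is to have Airflow shell out to `dbt run`, which executes the models in DBT/models (such as merged_station.sql). The sketch below assumes that approach; the DAG id, schedule, and project directory path are illustrative assumptions:

```python
# dags/dbt_airflow_sketch.py -- illustrative sketch of triggering the DBT transformations
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

DBT_PROJECT_DIR = "/opt/airflow/DBT"   # assumed path to the DBT project

with DAG(
    dag_id="dbt_transformations",       # hypothetical DAG id
    start_date=datetime(2023, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Runs every model in DBT/models (e.g. merged_station.sql) against the warehouse
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command=f"dbt run --project-dir {DBT_PROJECT_DIR}",
    )
```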