Project: Build a Data Warehouse using MySQL, Airflow and dbt

This project builds a scalable data warehouse tech stack to help provide an AI service to a client. The data used for this project is sensor data in CSV format; the Data (ucdavis.edu) page provides the same sensor data in Parquet and CSV. The ELT pipeline extracts data from the links on that page, stages it in a MySQL database, and transforms it using dbt. The whole process is automated with Airflow.
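As a rough illustration of the extract-and-stage step, the sketch below pulls one sensor CSV and appends it to a MySQL staging table. It assumes pandas, SQLAlchemy, and PyMySQL are installed; the URL, table name, and connection string are placeholders, not values taken from this repository.

```python
import pandas as pd
from sqlalchemy import create_engine

# Placeholder link and credentials -- substitute a real sensor CSV link from
# the Data (ucdavis.edu) page and your own MySQL connection details.
SENSOR_CSV_URL = "https://example.edu/sensors/station_data.csv"
ENGINE = create_engine("mysql+pymysql://user:password@localhost:3306/warehouse")

def stage_sensor_csv(url: str = SENSOR_CSV_URL, table: str = "station_raw") -> None:
    """Download one sensor CSV and append it to a MySQL staging table."""
    df = pd.read_csv(url)                                      # extract
    df.to_sql(table, ENGINE, if_exists="append", index=False)  # load into staging

if __name__ == "__main__":
    stage_sensor_csv()
```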

Table of Contents

  • Project Structure
  • ELT Pipeline
  • Built With
  • License

Project Structure

airflow
|___dags
  |____create_station_Summary.sql    # creates the database/tables in MySQL
  |____insert_station_summary.sql    # SQL file for loading data
  |____load_data_airflow.py          # DAG that loads data into staging tables
  |____dbt_airflow.py                # DAG that transforms data using dbt
DBT
|___models
  |____merged_station.sql            # SQL model for transforming the tables

ELT Pipeline

load_data_airflow.py

This DAG builds the staging side of the ELT pipeline in two tasks (a sketch follows the list):

  1. create_tables
    • creates the staging tables in MySQL, automated with Airflow
  2. load_tables
    • loads the raw CSV data into the staging tables, automated with Airflow
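
A minimal sketch of how these two tasks could be wired together, assuming Airflow 2.x with the MySQL provider installed; the connection id, schedule, and loader body are assumptions, not code from the repository.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator
from airflow.providers.mysql.operators.mysql import MySqlOperator

def load_staging_tables() -> None:
    # Placeholder for the CSV-to-staging load sketched earlier.
    ...

with DAG(
    dag_id="load_data_airflow",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    create_tables = MySqlOperator(
        task_id="create_tables",
        mysql_conn_id="mysql_warehouse",      # assumed Airflow connection id
        sql="create_station_Summary.sql",     # DDL file from the dags folder
    )
    load_tables = PythonOperator(
        task_id="load_tables",
        python_callable=load_staging_tables,
    )
    create_tables >> load_tables              # create tables before loading
```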

dbt_airflow.py

Transforms the staged tables using the dbt SQL models and automates the dbt run with Airflow.
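
A minimal sketch of such a DAG, assuming dbt is invoked through a BashOperator; the project and profiles paths are placeholders.

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="dbt_airflow",
    start_date=datetime(2021, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Runs every model under DBT/models, including merged_station.sql.
    dbt_run = BashOperator(
        task_id="dbt_run",
        bash_command="dbt run --project-dir /opt/dbt --profiles-dir /opt/dbt",
    )
```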

Built With

  • MySQL
  • Apache Airflow
  • dbt

License

MIT
