airflow-dag
Here are 22 public repositories matching this topic...
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
- Updated Mar 9, 2020 - Python
This repository is no longer maintained.
- Updated Mar 10, 2022 - Python
This project demonstrates how to build and automate an ETL pipeline written in Python and schedule it using the open-source Apache Airflow orchestration tool on an AWS EC2 instance.
- Updated Mar 14, 2025 - Python
The script automates the collection and insertion of KPIs related to transaction time and storage usage in a Data Warehouse, using Apache Airflow. It calculates the time elapsed since the last transaction and the percentage of storage usage, recording this data periodically in specific tables.
- Updated Feb 3, 2025 - Python
An Airflow DAG transformation framework
- Updated Jul 10, 2020 - Python
Analysing live tweets from Twitter by building a big data pipeline and scheduling it with Airflow (also using Kafka for tweet ingestion, Cassandra for storing parsed tweets, and Spark for analysis)
- Updated Aug 11, 2020 - Python
Build a data warehouse from scratch, including full load, daily incremental load, schema design, and SCD Types 1 and 2.
- Updated Feb 1, 2023 - Python
This is an ELT data pipeline set up to track the activities of an e-commerce website based on orders, reviews, deliveries and shipment dates. The project uses technologies such as Airflow, AWS RDS-Postgres, and Python.
- Updated Jul 5, 2024 - Python
DAG factory
- Updated Aug 15, 2024 - Python
Apache Airflow demo project that sets up 3 DAGs to explain how to pass parameters from a DAG to a triggered DAG.
- Updated Jan 7, 2023 - Python
Orchestrate a data pipeline using Airflow
- Updated Jul 12, 2024 - Python
An end-to-end GoodReads Data Pipeline for Building Data Lake, Data Warehouse and Analytics Platform.
- Updated Jan 8, 2024 - Python
1T Data "Data architect (DevOps)". Assignment 2024-08-21 6.8
- Updated Aug 27, 2024 - Python
This project focuses on implementing an ETL pipeline using Apache Airflow to efficiently extract data from Reddit, transform it as needed, and load it into an AWS S3 bucket. The use of Airflow allows for robust orchestration of the data workflow, ensuring that each step of the ETL process is executed in a reliable and repeatable manner.
- Updated Oct 30, 2024 - Python
Datitos - TP2 with steroids
- Updated Feb 17, 2022 - Jupyter Notebook
Airflow-powered ETL pipeline for moving Near-Earth-Object data from NASA to Google Cloud
- Updated Feb 12, 2025 - Python
Data Engineering Projects on data modelling, data warehousing, data lake development, orchestration and analysis
- Updated Sep 15, 2020 - Jupyter Notebook
Creation of a near-real-time data processing pipeline for Pinterest posts.
- Updated Jan 29, 2024 - Jupyter Notebook
Small project to play around with Apache Airflow and ETL
- Updated Apr 7, 2022 - Jupyter Notebook