🧙 Build, run, and manage data pipelines for integrating and transforming data.
mage-ai/mage-ai
🧙 A modern replacement for Airflow.
Documentation 🌪️ Get a 5 min overview 🌊 Play with live tool 🔥 Get instant help
Integrate and synchronize data from 3rd party sources
Build real-time and batch pipelines to transform data using Python, SQL, and R
Run, monitor, and orchestrate thousands of pipelines without losing sleep
1️⃣ 🏗️
Have you met anyone who said they loved developing in Airflow?
That’s why we designed an easy developer experience that you’ll enjoy.
↓
2️⃣ 🔮
Stop wasting time waiting around for your DAGs to finish testing.
Get instant feedback from your code each time you run it.
↓
3️⃣ 🚀
Don’t have a large team dedicated to Airflow?
Mage makes it easy for a single developer or small team to scale up and manage thousands of pipelines.
Mage is an open-source data pipeline tool for transforming and integrating data.
The recommended way to install the latest version of Mage is through Docker with the following command:
docker pull mageai/mageai:latest
You can also install Mage using pip or conda, though this may cause dependency issues without the proper environment.
pip install mage-ai
conda install -c conda-forge mage-ai
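Since a pip or conda install can run into dependency issues outside an isolated environment, it may help to check the environment first. The snippet below is a minimal sketch using only the Python standard library. It is not part of Mage; the helper names `in_virtual_environment` and `installed_version` are made up for illustration, and the only name taken from this README is the `mage-ai` distribution.

```python
import sys
from importlib import metadata
from typing import Optional


def in_virtual_environment() -> bool:
    """Return True when running inside a venv/virtualenv.

    Note: this check does not detect conda environments, where
    sys.prefix and sys.base_prefix are typically the same.
    """
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


def installed_version(distribution: str = "mage-ai") -> Optional[str]:
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


print("virtual environment:", in_virtual_environment())
print("mage-ai version:", installed_version() or "not installed")
```

If the first line prints `False`, creating a fresh virtual environment before running `pip install mage-ai` is one way to keep Mage's dependencies separate from other projects.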
Looking for help? The fastest way to get started is by checking out our documentation here.
Looking for quick examples? Open a demo project right in your browser or check out our guides.
Build and run a data pipeline with our demo app.
WARNING
The live demo is public to everyone. Please don’t save anything sensitive (e.g. passwords or secrets).
Click the image to play video
- Load data from API, transform it, and export it to PostgreSQL
- Integrate Mage into an existing Airflow project
- Train model on Titanic dataset
- Set up dbt models and orchestrate dbt runs
🎶 | Orchestration | Schedule and manage data pipelines with observability. |
📓 | Notebook | Interactive Python, SQL, & R editor for coding data pipelines. |
🏗️ | Data integrations | Synchronize data from 3rd party sources to your internal destinations. |
🚰 | Streaming pipelines | Ingest and transform real-time data. |
❎ | dbt | Build, run, and manage your dbt models with Mage. |
A sample data pipeline defined across 3 files ➝
- Load data ➝
    @data_loader
    def load_csv_from_file():
        return pd.read_csv('default_repo/titanic.csv')
- Transform data ➝
    @transformer
    def select_columns_from_df(df, *args):
        return df[['Age', 'Fare', 'Survived']]
- Export data ➝
    @data_exporter
    def export_titanic_data_to_disk(df) -> None:
        df.to_csv('default_repo/titanic_transformed.csv')
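To see how the three blocks hand data to one another, here is a rough, dependency-free sketch that can be run outside Mage. It makes several assumptions: the decorators are replaced by a no-op stand-in (`_block`, a hypothetical name), pandas is swapped for the standard-library `csv` module, and the two sample rows are invented placeholders rather than real Titanic records. Inside Mage, the decorators and the wiring between blocks are presumably handled by the pipeline itself.

```python
import csv
import io


def _block(func):
    """Hypothetical no-op stand-in for Mage's block decorators."""
    return func


data_loader = transformer = data_exporter = _block

# Invented placeholder rows, not real data from titanic.csv.
SAMPLE_CSV = (
    "Name,Age,Fare,Survived\n"
    "passenger_a,22,7.25,0\n"
    "passenger_b,38,71.28,1\n"
)


@data_loader
def load_csv_from_text():
    # Parse the CSV text into a list of dicts, one per row.
    return list(csv.DictReader(io.StringIO(SAMPLE_CSV)))


@transformer
def select_columns(rows, *args):
    # Keep only the three columns used in the README example.
    keep = ["Age", "Fare", "Survived"]
    return [{key: row[key] for key in keep} for row in rows]


@data_exporter
def export_to_csv_text(rows) -> str:
    # Write to an in-memory buffer instead of a file on disk.
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Chain the blocks by hand: load, then transform, then export.
rows = select_columns(load_csv_from_text())
output = export_to_csv_text(rows)
print(output)
```

Each function only sees the return value of the previous one, which is what lets a block be run and tested on its own.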
What the data pipeline looks like in the UI ➝
New? We recommend reading about blocks and learning from a hands-on tutorial.
Every user experience and technical design decision adheres to these principles.
💻 | Easy developer experience | Open-source engine that comes with a custom notebook UI for building data pipelines. |
🚢 | Engineering best practices built-in | Build and deploy data pipelines using modular code. No more writing throwaway code or trying to turn notebooks into scripts. |
💳 | Data is a first-class citizen | Designed from the ground up specifically for running data-intensive workflows. |
🪐 | Scaling is made simple | Analyze and process large data quickly for rapid iteration. |
These are the fundamental concepts that Mage uses to operate.
Project | Like a repository on GitHub; this is where you write all your code. |
Pipeline | Contains references to all the blocks of code you want to run, charts for visualizing data, and organizes the dependency between each block of code. |
Block | A file with code that can be executed independently or within a pipeline. |
Data product | Every block produces data after it's been executed. These are called data products in Mage. |
Trigger | A set of instructions that determine when or how a pipeline should run. |
Run | Stores information about when it was started, its status, when it was completed, any runtime variables used in the execution of the pipeline or block, etc. |
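One way to picture how these concepts relate is the toy model below. It is only an illustrative sketch and does not reflect Mage's internal classes or API: the `Block`, `Pipeline`, and `Run` dataclasses, their fields, and the `execute` method are all hypothetical names chosen to mirror the table. Blocks are ordered by their dependencies with the standard-library `graphlib.TopologicalSorter` (Python 3.9+), and each block's return value is stored as its data product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from graphlib import TopologicalSorter
from typing import Any, Callable, Dict, List, Optional


@dataclass
class Block:
    """A unit of code that can run on its own or within a pipeline."""
    name: str
    func: Callable[..., Any]
    upstream: List[str] = field(default_factory=list)


@dataclass
class Run:
    """Records start time, status, completion time, and runtime variables."""
    started_at: datetime
    status: str = "running"
    completed_at: Optional[datetime] = None
    variables: Dict[str, Any] = field(default_factory=dict)
    data_products: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Pipeline:
    """Holds references to blocks and the dependencies between them."""
    name: str
    blocks: Dict[str, Block] = field(default_factory=dict)

    def add(self, block: Block) -> None:
        self.blocks[block.name] = block

    def execute(self, **variables: Any) -> Run:
        run = Run(started_at=datetime.now(timezone.utc), variables=variables)
        graph = {b.name: b.upstream for b in self.blocks.values()}
        try:
            # static_order() yields upstream blocks before their dependents.
            for name in TopologicalSorter(graph).static_order():
                block = self.blocks[name]
                inputs = [run.data_products[u] for u in block.upstream]
                run.data_products[name] = block.func(*inputs, **variables)
            run.status = "completed"
        except Exception:
            run.status = "failed"
            raise
        finally:
            run.completed_at = datetime.now(timezone.utc)
        return run


def load(**variables):
    return [1, 2, 3, 4]


def scale(rows, **variables):
    return [row * variables.get("factor", 2) for row in rows]


def total(rows, **variables):
    return sum(rows)


pipeline = Pipeline("demo")
pipeline.add(Block("load", load))
pipeline.add(Block("scale", scale, upstream=["load"]))
pipeline.add(Block("total", total, upstream=["scale"]))

run = pipeline.execute(factor=3)
print(run.status, run.data_products["total"])  # prints: completed 30
```

A trigger is left out of the sketch; in this toy model it would simply be whatever decides when to call `pipeline.execute(...)` and with which runtime variables.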
Add features and instantly improve the experience for everyone.
Check out the contributing guide to set up your development environment and start building.
Individually, we’re a mage.
🧙 Mage
Magic is indistinguishable from advanced technology. A mage is someone who uses magic (aka advanced technology). Together, we’re Magers!
🧙♂️🧙 Magers (/ˈmājər/)
A group of mages who help each other realize their full potential! Let’s hang out and chat together ➝
For real-time news, fun memes, data engineering topics, and more, join us on ➝
- GitHub
- Slack
Check out our FAQ page to find answers to some of our most asked questions.
See the LICENSE file for licensing information.