data-integration
Here are 483 public repositories matching this topic...
Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
- Updated Apr 4, 2025 - Python
Turns Data and AI algorithms into production-ready web applications in no time.
- Updated Apr 3, 2025 - Python
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
- Updated Apr 4, 2025 - Python
An orchestration platform for the development, production, and observation of data assets.
- Updated Apr 4, 2025 - Python
SeaTunnel is a next-generation super high-performance, distributed, massive data integration tool.
- Updated Apr 3, 2025 - Java
🧙 Build, run, and manage data pipelines for integrating and transforming data.
- Updated Apr 1, 2025 - Python
The developer-first cloud governance platform.
- Updated Apr 3, 2025 - Go
Flink CDC is a streaming data integration tool.
- Updated Apr 3, 2025 - Java
Upserts, Deletes And Incremental Processing on Big Data.
- Updated Apr 4, 2025 - Java
Lean and mean distributed stream processing system written in Rust and WebAssembly. An alternative to Kafka + Flink in one.
- Updated Apr 3, 2025 - Rust
Jitsu is an open-source Segment alternative. Fully scriptable data ingestion engine for modern data teams. Set up a real-time data pipeline in minutes, not days.
- Updated Apr 3, 2025 - TypeScript
Privacy- and security-focused Segment alternative, in Golang and React.
- Updated Apr 4, 2025 - Go
Community-curated list of software packages and data resources for single-cell, including RNA-seq, ATAC-seq, etc.
- Updated Mar 27, 2025
ingestr is a CLI tool to seamlessly copy data between any databases with a single command.
- Updated Apr 1, 2025 - Python
Apache DevLake is an open-source dev data platform to ingest, analyze, and visualize the fragmented data from DevOps tools, extracting insights for engineering excellence, developer experience, and community growth.
- Updated Mar 31, 2025 - Go
A lightweight, opinionated ETL framework, halfway between plain scripts and Apache Airflow.
- Updated Dec 15, 2023 - Python
BitSail is a distributed high-performance data integration engine which supports batch, streaming and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of data every day.
- Updated Jan 1, 2024 - Java
Hop Orchestration Platform
- Updated Apr 3, 2025 - Java
Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition, we provide third-party data into data sc…
- Updated Aug 10, 2022 - JavaScript