etl-pipeline

Here are 2,659 public repositories matching this topic...

risingwave

Streaming data platform. Real-time stream processing, low-latency serving, and Iceberg table management.

  • Updated Dec 18, 2025
  • Rust
unstract

No-code LLM Platform to launch APIs and ETL Pipelines to structure unstructured documents

  • Updated Dec 17, 2025
  • Python
streampark

Make stream processing easier! Easy-to-use streaming application development framework and operation platform.

  • Updated Nov 5, 2025
  • Java

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

  • Updated Dec 6, 2025
  • Jupyter Notebook

Implementing best practices for PySpark ETL jobs and applications.

  • Updated Jan 1, 2023
  • Python
Udacity-Data-Engineering-Projects

A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.

  • Updated Aug 26, 2022
  • Python
goodreads_etl_pipeline

Enterprise-grade and API-first LLM workspace for unstructured documents, including data extraction, redaction, rights management, prompt playground, and more!

  • Updated Dec 17, 2025
  • Python

A high-performance data processing system written in Clojure

  • Updated Dec 17, 2025
  • Clojure

A blazingly fast general-purpose blockchain analytics engine specialized in systematic MEV detection

  • Updated Jul 28, 2025
  • Rust
FlashLearn

Integrate LLMs into any pipeline: fit/predict pattern, JSON-driven flows, and built-in concurrency support.

  • Updated Mar 10, 2025
  • Python

A simplified, lightweight ETL Framework based on Apache Spark

  • Updated Jan 24, 2024
  • Scala

The Supabase of AI era. A modular, open-source backend for building AI-native software — designed for knowledge, not static data.

  • Updated Jun 5, 2025
  • TypeScript

An end-to-end data engineering pipeline that orchestrates data ingestion, processing, and storage using Apache Airflow, Python, Apache Kafka, Apache Zookeeper, Apache Spark, and Cassandra. All components are containerized with Docker for easy deployment and scalability.

  • Updated Feb 14, 2025
  • Python

Service for bulk-loading data to databases with automatic schema management (Redshift, Snowflake, BigQuery, ClickHouse, Postgres, MySQL)

  • Updated Nov 14, 2025
  • Go

This is a template you can use for your next data engineering portfolio project.

  • Updated Sep 10, 2021
