data-lake

Here are 272 public repositories matching this topic...

lakeFS

data load tool (dlt) is an open source Python library that makes data loading easy 🛠️

  • Updated Mar 20, 2025
  • Python

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.

  • Updated Mar 20, 2025
  • Scala

BitSail is a distributed high-performance data integration engine which supports batch, streaming and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of data every day.

  • Updated Jan 1, 2024
  • Java
Udacity-Data-Engineering-Projects

A few projects related to data engineering, including data modeling, cloud infrastructure setup, data warehousing, and data lake development.

  • Updated Aug 26, 2022
  • Python
goodreads_etl_pipeline

Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licensed under Apache 2.0. Contributed by Teradata Inc.

  • Updated Jan 12, 2023
  • Java

Lakekeeper is an Apache-licensed, secure, fast, and easy-to-use Apache Iceberg REST Catalog written in Rust.

  • Updated Mar 20, 2025
  • Rust

Generic Data Ingestion & Dispersal Library for Hadoop

  • Updated Mar 19, 2023
  • Java
data-lakes-on-aws

Enterprise-grade, production-hardened, serverless data lake on AWS

  • Updated Mar 18, 2025
  • Python

Real Time Big Data / IoT Machine Learning (Model Training and Inference) with HiveMQ (MQTT), TensorFlow IO and Apache Kafka - no additional data store like S3, HDFS or Spark required

  • Updated Nov 5, 2020
  • Jupyter Notebook
amazon-s3-find-and-forget

Amazon S3 Find and Forget is a solution for handling data erasure requests against data lakes stored on Amazon S3, for example, pursuant to the European General Data Protection Regulation (GDPR).

  • Updated Mar 5, 2025
  • Python

BtrBlocks: Efficient Columnar Compression for Data Lakes (SIGMOD 2023 Paper)

  • Updated May 7, 2024
  • C++

U-SQL Examples and Issue Tracking

  • Updated Mar 28, 2023
  • C#
wren-engine

🤖 The semantic engine for LLMs, bringing semantic context to AI agents. 🔥

  • Updated Mar 20, 2025
  • Java

Resources for video demonstrations and blog posts related to DataOps on AWS

  • Updated Jan 26, 2022
  • Python

An efficient storage and compute engine for both on-prem and cloud-native data analytics.

  • Updated Mar 18, 2025
  • Java
