lakehouse
Here are 309 public repositories matching this topic...
ClickHouse® is a real-time analytics database management system
- Updated Feb 20, 2026 - C++
The official home of the Presto distributed SQL query engine for big data
- Updated Feb 20, 2026 - Java
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance for multi-dimensional analytics, real-time analytics, and ad-hoc queries. A Linux Foundation project.
- Updated Feb 20, 2026 - Java
Data Agent Ready Warehouse: One for Analytics, Search, AI, Python Sandbox. Rebuilt from scratch. Unified architecture on your S3.
- Updated Feb 20, 2026 - Rust
LakeSoul is an end-to-end, realtime and cloud native Lakehouse framework with fast data ingestion, concurrent update and incremental data analytics on cloud storages for both BI and AI applications.
- Updated Feb 13, 2026 - Java
World's most powerful open data catalog for building a high-performance, geo-distributed and federated metadata lake.
- Updated Feb 17, 2026 - Java
YTsaurus is a scalable and fault-tolerant open-source big data platform.
- Updated Feb 20, 2026 - C++
Real-time analytics on Postgres tables
- Updated Dec 3, 2025 - Rust
Apache Polaris, the interoperable, open source catalog for Apache Iceberg
- Updated Feb 20, 2026 - Java
Apache Fluss is a streaming storage built for real-time analytics.
- Updated Feb 18, 2026 - Java
OLake - Fastest Databases, Kafka & S3 Replication to Apache Iceberg or Plain Parquet. ⚡ Efficient, quick and scalable data ingestion for real-time analytics. Supported sources: Postgres, MongoDB, MySQL, Oracle, MSSql, DB2, Kafka, S3.
- Updated Feb 20, 2026 - Go
Lakekeeper is an Apache-Licensed, secure, fast and easy to use Apache Iceberg REST Catalog written in Rust.
- Updated Feb 20, 2026 - Rust
DuckDB-powered data lake analytics from Postgres
- Updated Mar 19, 2025 - Rust
A curated list of open source tools used in analytics platforms and data engineering ecosystem
- Updated Mar 12, 2025
GigAPI is a Timeseries lakehouse for real-time data and sub-second queries, powered by DuckDB OLAP + Parquet Query Engine, Compactor w/ Cloud-Native Storage. Drop-in FDAP alternative ⭐
- Updated Oct 20, 2025 - Go
Examples of using Terraform to deploy Databricks resources
- Updated Feb 20, 2026 - HCL