hdfs
Here are 1,021 public repositories matching this topic...
SeaweedFS is a fast distributed storage system for blobs, objects, files, and data lakes, built to handle billions of files. The blob store has O(1) disk seek and cloud tiering. The Filer supports Cloud Drive, xDC replication, Kubernetes, POSIX FUSE mount, S3 API, S3 Gateway, Hadoop, WebDAV, encryption, and Erasure Coding. The enterprise version is at seaweedfs.com.
- Updated Jul 18, 2025 - Go
Ceph is a distributed object, block, and file storage platform
- Updated Jul 18, 2025 - C++
JuiceFS is a distributed POSIX file system built on top of Redis and S3.
- Updated Jul 17, 2025 - Go
Utils for streaming large files (S3, HDFS, gzip, bz2...)
- Updated Jul 10, 2025 - Python
The Universal Storage Engine
- Updated Jul 18, 2025 - C++
A fast and versatile ETL tool that can transfer data between RDBMS and NoSQL seamlessly
- Updated Jul 18, 2025 - Java
80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.
- Updated Apr 25, 2025 - Python
Deprecated - See Lenses.io Community Edition
- Updated May 7, 2025 - JavaScript
CloudEon uses Kubernetes to install and deploy open-source big data components, enabling the containerized operation of an open-source big data platform. This allows you to reduce your focus on underlying resource management and maintenance.
- Updated Mar 27, 2025 - FreeMarker
Fundamentals of Spark with Python (using PySpark), code examples
- Updated Oct 29, 2022 - Jupyter Notebook