#apache-hadoop

Here are 93 public repositories matching this topic...

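hadoop-cos (the CosN file system) provides integration support for big data computing frameworks such as Apache Hadoop, Spark, and Tez, so that data stored on Tencent Cloud COS can be read and written just like data on HDFS. It can also serve as Deep Storage for query and analytics engines such as Druid.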

  • Updated Dec 4, 2025
  • Java

HADOOP 3.1.0 winutils

  • Updated Apr 12, 2018
  • Batchfile

Export Hadoop YARN (ResourceManager) metrics in Prometheus format

  • Updated Apr 15, 2025
  • Go

Containerized Apache Hive Metastore for horizontally scalable Hive Metastore deployments

  • Updated Jan 31, 2022
  • Dockerfile

A Spark application to merge small files on Hadoop

  • Updated Sep 7, 2020
  • Scala

This repository provides a guide to preprocess and analyze the network intrusion data set using NumPy, Pandas, and matplotlib, and implement a random forest classifier machine learning model using Scikit-learn.

  • Updated May 8, 2024
  • Jupyter Notebook

Some simple, kinda introductory projects based on Apache Hadoop to be used as guides in order to make the MapReduce model look less weird or boring.

  • Updated May 22, 2024
  • Java

spark-minimal-algorithms

A Python implementation of minimal MapReduce algorithms for Apache Spark

  • Updated Jun 22, 2020
  • Python

A fast, scalable and distributed community detection algorithm based on CEIL scoring function.

  • Updated Jan 1, 2019
  • Scala

An implementation of Apache Spark (combined with PySpark and Jupyter Notebook) on top of a Hadoop cluster using Docker

  • Updated May 10, 2024
  • Shell

This repository showcases a Medallion Architecture Data Lakehouse designed for both batch and real-time processing of e-commerce and marketing data. It supports comprehensive data analysis, reporting, and monitoring, providing a scalable solution for deriving insights from integrated datasets.

  • Updated Sep 26, 2024
  • Jupyter Notebook

Set of Input Formats for Hadoop Streaming

  • Updated Sep 25, 2024
  • Java

Kubernetes operator for managing the lifecycle of Apache Hadoop YARN tasks on Kubernetes.

  • Updated Jan 19, 2024
  • Go

Apache Hadoop cluster configuration using the original apache/hadoop:3.4.1 Docker image (with YARN)

  • Updated Jun 30, 2025
  • Shell

An email spam filter using Apache Spark’s ML library

  • Updated Apr 14, 2021
  • Python
