
Apache Spark


Apache Spark is an open source distributed general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.

Here are 9,439 public repositories matching this topic...

Apache Spark - A unified analytics engine for large-scale data processing

  • Updated Dec 18, 2025
  • Scala

Data Engineering Zoomcamp is a free 9-week course on building production-ready data pipelines. The next cohort starts in January 2026. Join the course here 👇🏼

  • Updated Dec 16, 2025
  • Jupyter Notebook

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated Mar 20, 2024
  • Python

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

  • Updated Dec 18, 2025
  • Python

Learn and understand Docker and container technologies, with real DevOps practice!

  • Updated Dec 17, 2025
  • Go

List of Data Science Cheatsheets to rule the world

  • Updated Jul 18, 2024

GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.

  • Updated Aug 15, 2025
  • Python
flink-learning

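Flink learning blog: http://www.54tianzhisheng.cn/ Covers Flink basics, concepts, principles, hands-on practice, performance tuning, and source code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, and Table API & SQL, plus case studies of large production Flink projects (PV/UV, log storage, real-time deduplication at ten-billion-record scale, monitoring and alerting). Support for the author's column 《大数据实时计算引擎 Flink 实战与性能优化》 ("Flink, the Big Data Real-Time Compute Engine: Practice and Performance Optimization") is welcome.

  • Updated Mar 12, 2025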
  • Java

Apache Doris is an easy-to-use, high performance and unified analytics database.

  • Updated Dec 18, 2025
  • Java

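[Big Tech Interview Column] A technical guide for Java programmers, covering interview questions, system architecture, career tips, mainstream middleware, and more, to help you become a better version of yourself!

  • Updated Jul 21, 2025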

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

  • Updated Dec 1, 2025
  • Python

Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running math code and a java based math library on top of the core c++ library. Also includes samediff: a pytorch/tensorflow like library for running deep learn...

  • Updated Dec 17, 2025
  • Java

Focused on big data learning and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...

  • Updated Aug 7, 2023

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

  • Updated Dec 18, 2025
  • Scala

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

  • Updated Dec 18, 2025
  • Jupyter Notebook

Alluxio, data orchestration for analytics and machine learning in the cloud

  • Updated Apr 29, 2025
  • Java

A Flexible and Powerful Parameter Server for large-scale machine learning

  • Updated Oct 13, 2025
  • Java

Created by Matei Zaharia

Released May 26, 2014

Followers
435 followers
Repository
apache/spark
Website
github.com/topics/spark
Wikipedia
Wikipedia

Related topics

hadoop scala
