
Apache Spark


Apache Spark is an open-source, general-purpose distributed cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.

Here are 9,120 public repositories matching this topic...

Apache Spark - A unified analytics engine for large-scale data processing

  • Updated Jul 18, 2025
  • Scala

Data Engineering Zoomcamp is a free nine-week course that covers the fundamentals of data engineering.

  • Updated Jul 17, 2025
  • Jupyter Notebook

Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.

  • Updated Mar 20, 2024
  • Python

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

  • Updated Jul 17, 2025
  • Python

Learn and understand Docker and container technologies, with real DevOps practice!

  • Updated Dec 26, 2024
  • Go

List of Data Science Cheatsheets to rule the world

  • Updated Jul 18, 2024

GUI for ChatGPT API and many LLMs. Supports agents, file-based QA, GPT finetuning and query with web search. All with a neat UI.

  • Updated Mar 13, 2025
  • Python
flink-learning

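Flink learning blog: http://www.54tianzhisheng.cn/. Covers Flink basics, concepts, principles, hands-on practice, performance tuning, and source-code analysis. Includes learning examples for Flink Connector, Metrics, Library, DataStream API, and Table API & SQL, plus case studies of large production Flink deployments (PV/UV, log storage, real-time deduplication of tens of billions of records, monitoring and alerting). Support for the author's column 《大数据实时计算引擎 Flink 实战与性能优化》 ("Flink, the Big Data Real-Time Compute Engine: Practice and Performance Optimization") is welcome.

  • Updated Mar 12, 2025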
  • Java

Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

  • Updated Jul 1, 2025
  • Python

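[Big Tech Interview Column] A technical guide for Java programmers, covering interview questions, system architecture, career tips, mainstream middleware, and more, to help you become a better version of yourself!

  • Updated Oct 28, 2023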

Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for Keras, TensorFlow, and ONNX/PyTorch, a modular and tiny C++ library for running math code, and a Java-based math library on top of the core C++ library. Also includes SameDiff: a PyTorch/TensorFlow-like library for running deep learn...

  • Updated Jul 18, 2025
  • Java

Apache Doris is an easy-to-use, high performance and unified analytics database.

  • Updated Jul 18, 2025
  • Groovy

Focused on big data learning and interview preparation; the road to big data mastery starts here. Flink/Spark/Hadoop/Hbase/Hive...

  • Updated Aug 7, 2023

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

  • Updated Jul 18, 2025
  • Scala

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.

  • Updated Jul 18, 2025
  • Jupyter Notebook

Alluxio, data orchestration for analytics and machine learning in the cloud

  • Updated Apr 29, 2025
  • Java

A Flexible and Powerful Parameter Server for large-scale machine learning

  • Updated Jul 8, 2025
  • Java

Created by Matei Zaharia

Released May 26, 2014

Followers: 428
Repository: apache/spark
Website: github.com/topics/spark
Wikipedia: Wikipedia

Related Topics

hadoop, scala
