# pyspark

Here are 3,914 public repositories matching this topic...

SynapseML
spark-nlp

Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines.

  • Updated Mar 11, 2025
  • Java

Implementing best practices for PySpark ETL jobs and applications.

  • Updated Jan 1, 2023
  • Python

Petastorm library enables single machine or distributed training and evaluation of deep learning models from datasets in Apache Parquet format. It supports ML frameworks such as Tensorflow, Pytorch, and PySpark and can be used from pure Python code.

  • Updated Dec 2, 2023
  • Python

A curated list of awesome Apache Spark packages and resources.

  • Updated Oct 24, 2024
  • Shell

Apache Spark & Python (pySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks

  • Updated Mar 16, 2024
  • Jupyter Notebook

SQL data analysis & visualization projects using MySQL, PostgreSQL, SQLite, Tableau, Apache Spark and pySpark.

  • Updated Jul 18, 2022
  • Jupyter Notebook

Jupyter magics and kernels for working with remote Spark clusters

  • Updated Mar 3, 2025
  • Python

PySpark-Tutorial provides basic algorithms using PySpark

  • Updated Jan 25, 2025
  • Jupyter Notebook

Sparkling Water provides H2O functionality inside a Spark cluster

  • Updated Nov 19, 2024
  • Scala

Lightweight and extensible compatibility layer between dataframe libraries!

  • Updated Mar 17, 2025
  • Python

Scriptis is for interactive data analysis with script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, resource management and intelligent diagnosis.

  • Updated Dec 11, 2024
  • Vue

pyspark🍒🥭 is delicious, just eat it!😋😋

  • Updated Sep 22, 2022
  • Python

80+ DevOps & Data CLI Tools - AWS, GCP, GCF Python Cloud Functions, Log Anonymizer, Spark, Hadoop, HBase, Hive, Impala, Linux, Docker, Spark Data Converters & Validators (Avro/Parquet/JSON/CSV/INI/XML/YAML), Travis CI, AWS CloudFormation, Elasticsearch, Solr etc.

  • Updated Mar 14, 2025
  • Python
kuwala

Kuwala is the no-code data platform for BI analysts and engineers, enabling you to build powerful analytics workflows. We set out to bring the state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations, together in one intuitive interface built with React Flow. In addition we provide third-party data into data sc…

  • Updated Aug 10, 2022
  • JavaScript
