dataframes-api
Here are 14 public repositories matching this topic...
Cylon is a fast, scalable, distributed-memory parallel runtime with a Pandas-like DataFrame.
- Updated
Feb 4, 2026 - C++
TPC-H queries in Apache Spark SQL using the native DataFrames API
- Updated
Jan 24, 2024 - C
Java application that uses Apache Spark to handle both batch and streaming processing
- Updated
Jan 22, 2021 - Java
mainframe - a lightweight dataframe library for C++
- Updated
Jun 3, 2023 - C++
Apache Spark project for Advanced Topics on Databases course
- Updated
Mar 19, 2021 - Python
Python Skills Checkpoint
- Updated
Oct 14, 2025 - Jupyter Notebook
API converting NYC Department of Health: https://github.com/nychealth/coronavirus-data
- Updated
Dec 8, 2022 - Python
Construct source files as per the target files in Spark, using the DataFrame API and Spark
- Updated
Oct 19, 2021 - Scala
A sandbox environment designed to simulate a pseudo-distributed Hadoop cluster with integrated Apache Spark and Kafka components. It allows developers to prototype and experiment with big data workflows, test distributed computing patterns, and explore cluster behavior in a contained virtual setup.
- Updated
Nov 15, 2025 - Java
Semester assignment for ECE NTUA 3189 Advanced Topics in Database Systems
- Updated
Feb 5, 2023 - Scala
Soundhopper project, created so users can skip ahead to specified sections of a track. Built using Python and Jupyter Notebook.
- Updated
Jan 21, 2021 - Jupyter Notebook
Makes columnar Spark files easier to use
- Updated
Jan 2, 2018 - Scala
Analysis of American Time Use Survey (ATUS): https://www.kaggle.com/bls/american-time-use-survey
- Updated
May 7, 2017 - Scala