data-wrangling
Here are 1,384 public repositories matching this topic...
OpenRefine is a free, open source power tool for working with messy data and improving it
- Updated Nov 28, 2025 - Java
Select, put and delete data from JSON, TOML, YAML, XML and CSV files with a single tool. Supports conversion between formats and can be used as a Go package.
- Updated Nov 17, 2025 - Go
A Collection of Cheatsheets, Books, Questions, and Portfolio For DS/ML Interview Prep
- Updated Aug 31, 2024 - Jupyter Notebook
Blazing-fast Data-Wrangling toolkit
- Updated Nov 29, 2025 - Rust
Carefully curated resource links for data science in one place
- Updated Aug 17, 2024
Analytics, Versioning and ETL for multimodal data: video, audio, PDFs, images
- Updated Nov 29, 2025 - Python
Zui is a powerful desktop application for exploring and working with data. The official front-end to the Zed lake.
- Updated Nov 28, 2025 - TypeScript
A Python toolbox for gaining geometric insights into high-dimensional data
- Updated Jul 10, 2025 - Python
🚚 Agile Data Preparation Workflows made easy with Pandas, Dask, cuDF, Dask-cuDF, Vaex and PySpark
- Updated Dec 2, 2024 - Python
Machine learning with dataframes
- Updated Nov 28, 2025 - Python
The JavaScript data transformation and analysis toolkit inspired by Pandas and LINQ.
- Updated Apr 30, 2025 - TypeScript
Statistical Inference via Data Science: A ModernDive into R and the Tidyverse
- Updated Nov 20, 2025 - HTML
Materials for following along with Hands-On Data Analysis with Pandas – Second Edition
- Updated May 6, 2025 - Jupyter Notebook
The Microsoft Program Synthesis using Examples SDK is a framework of technologies for automatically generating programs from input-output examples. This repo includes samples and sample data for the Microsoft Program Synthesis using Examples SDK.
- Updated Nov 19, 2025 - C#
Desbordante is a high-performance data profiler that can discover many different patterns in data using various algorithms. It can also run data-cleaning scenarios built on these algorithms. Desbordante has a console version and an easy-to-use web application.
- Updated Nov 28, 2025 - C++
Materials for following along with Hands-On Data Analysis with Pandas.
- Updated Nov 12, 2025 - Jupyter Notebook
An introductory workshop on pandas with notebooks and exercises for following along. Slides contain all solutions.
- Updated Nov 1, 2025 - Jupyter Notebook
Data Analysis and Visualization in R for Ecologists
- Updated Nov 25, 2025 - R
Package that processes and organizes data from the Cadastro Nacional da Pessoa Jurídica (CNPJ), Brazil's national registry of legal entities
- Updated Sep 18, 2021 - R
Like awk, but with SQL and table joins
- Updated Nov 25, 2024 - Tcl