dataquality
Here are 81 public repositories matching this topic...
Always know what to expect from your data.
- Updated Mar 14, 2025 - Python
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
- Updated Mar 12, 2025 - Python
OpenMetadata is a unified metadata platform for data discovery, data observability, and data governance powered by a central metadata repository, in-depth column level lineage, and seamless team collaboration.
- Updated Mar 17, 2025 - TypeScript
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
- Updated Mar 5, 2025 - Scala
Compare tables within or across databases
- Updated May 17, 2024 - Python
⚡ Data quality testing for the modern data stack (SQL, Spark, and Pandas) https://www.soda.io
- Updated Mar 16, 2025 - Python
re_data - fix data issues before your users & CEO would discover them 😊
- Updated Apr 30, 2024 - HTML
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
- Updated Mar 16, 2025 - Java
ML powered analytics engine for outlier detection and root cause analysis.
- Updated Sep 12, 2024 - Python
The premier open source Data Quality solution
- Updated Nov 16, 2024 - Java
Know your data better! Datavines is a next-gen data observability platform that supports metadata management and data quality.
- Updated Feb 20, 2025 - Java
Library for Semi-Automated Data Science
- Updated Sep 16, 2024 - Python
Possibly the fastest DataFrame-agnostic quality check library in town.
- Updated Mar 10, 2025 - Python
Open Source Data Quality Monitoring.
- Updated Mar 11, 2025 - Python
Find data quality issues and clean your data in a single line of code with a Scikit-Learn compatible Transformer.
- Updated Dec 13, 2023 - Python
Frontend for the osmcha-django REST API
- Updated Feb 24, 2025 - JavaScript
Installer for DataKitchen's Open Source Data Observability Products. Data breaks. Servers break. Your toolchain breaks. Ensure your team is the first to know and the first to solve with visibility across and down your data estate. Save time with simple, fast data quality test generation and execution. Trust your data, tools, and systems end to end.
- Updated Mar 14, 2025 - Python
Dingo: A Comprehensive Data Quality Evaluation Tool
- Updated Mar 17, 2025 - JavaScript
DataOps Data Quality TestGen is part of DataKitchen's Open Source Data Observability. DataOps TestGen delivers simple, fast data quality test generation and execution through data profiling, hygiene review of new datasets, AI generation of data quality validation tests, ongoing testing of data refreshes, and continuous anomaly monitoring.
- Updated Mar 7, 2025 - Python