data-centric-machine-learning
Here are 14 public repositories matching this topic...
A Doctor for your data
- Updated Jan 14, 2025 - Python
A curated, but incomplete, list of data-centric AI resources.
- Updated Jun 26, 2024
PyTorch implementation of DoReMi, a method for optimizing the data mixture weights in language modeling datasets
- Updated Dec 26, 2023 - HTML
Contains implementations of data-centric approaches for improving semantic segmentation on satellite imagery.
- Updated Apr 10, 2025 - Python
A list of data-efficient and data-centric LLM (Large Language Model) papers. Our Survey Paper: Towards Efficient LLM Post Training: A Data-centric Perspective
- Updated Feb 19, 2025
Data-IQ: Characterizing subgroups with heterogeneous outcomes in tabular data (NeurIPS 2022)
- Updated Mar 20, 2023 - Jupyter Notebook
(Pattern Recognition 2025) Towards Trustworthy Dataset Distillation
- Updated Dec 8, 2024 - Python
Enhancing Efficiency in Multidevice Federated Learning through Data Selection
- Updated Apr 15, 2024 - Python
TRIAGE: Characterizing and auditing training data for improved regression (NeurIPS 2023)
- Updated Mar 14, 2024 - Jupyter Notebook
Data-SUITE: Data-centric identification of in-distribution incongruous examples (ICML 2022)
- Updated Mar 8, 2023 - Jupyter Notebook
You can’t handle the (dirty) truth: Data-centric insights improve pseudo-labeling
- Updated Jun 30, 2024 - Jupyter Notebook
A multi-view panorama of Data-Centric AI: Techniques, Tools, and Applications (ECAI Tutorial 2024)
- Updated Oct 19, 2024
Implementation of data typology for imbalanced datasets.
- Updated Jun 4, 2023 - MATLAB
Data clustering using the Expectation Maximization algorithm. To cite this Original Software Publication: https://www.sciencedirect.com/science/article/pii/S2352711021001771
- Updated Oct 25, 2021 - R