Entity resolution
Entity resolution (also known as data matching, data linkage, record linkage, and many other terms) is the task of finding records in a data set that refer to the same entity across different data sources (e.g., data files, books, websites, and databases). Entity resolution is necessary when joining data sets based on entities that may or may not share a common identifier (e.g., database key, URI, national identification number). A shared identifier may be missing because of differences in record shape, storage location, or curator style or preference.
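As a rough illustration of the task described above, the following sketch links records from two sources that share no common identifier by comparing normalized names. It is a minimal example built on Python's standard `difflib` module, not the method used by any repository listed on this page. The record layout, the sample data, the helper names (`normalize`, `similarity`, `link_records`), and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in text.lower())
    return " ".join(kept.split())


def similarity(a, b):
    """Similarity in [0, 1] between two names after normalization."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def link_records(left, right, threshold=0.85):
    """For each left record, keep its best-scoring right record if the score
    reaches the threshold. Returns (left_id, right_id, score) tuples."""
    links = []
    for l_rec in left:
        best = None
        for r_rec in right:
            score = similarity(l_rec["name"], r_rec["name"])
            if best is None or score > best[1]:
                best = (r_rec["id"], score)
        if best is not None and best[1] >= threshold:
            links.append((l_rec["id"], best[0], round(best[1], 3)))
    return links


# Hypothetical sample data: two sources with no shared key.
crm = [
    {"id": "crm-1", "name": "Acme Corp."},
    {"id": "crm-2", "name": "Globex Corporation"},
    {"id": "crm-3", "name": "Initech"},
]
billing = [
    {"id": "bill-9", "name": "ACME Corp"},
    {"id": "bill-7", "name": "Globex Corporation, Inc."},
    {"id": "bill-4", "name": "Umbrella Holdings"},
]

links = link_records(crm, billing)
for left_id, right_id, score in links:
    print(left_id, "<->", right_id, score)
```

With this sample data, the two spelling variants of each company are linked and the unmatched records are left alone. The all-pairs comparison is quadratic in the number of records, so it only suits small inputs. Larger data sets typically need a blocking or indexing step, which this sketch omits.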
Here are 191 public repositories matching this topic...
🆔 A Python library for accurate and scalable fuzzy matching, record deduplication and entity-resolution.
- Updated Jul 29, 2025 - Python
Fast, accurate and scalable probabilistic data linkage with support for multiple SQL backends
- Updated Dec 17, 2025 - Python
Scalable identity resolution, entity resolution, data mastering and deduplication using ML
- Updated Dec 17, 2025 - Java
A powerful and modular toolkit for record linkage and duplicate detection in Python
- Updated Feb 21, 2024 - Python
1 line for thousands of state-of-the-art NLP models in hundreds of languages. The fastest and most accurate way to solve text problems.
- Updated Jan 28, 2025 - Python
On-device Speech-to-Intent engine powered by deep learning
- Updated Dec 17, 2025 - Python
🆔 Command line tool for deduplicating CSV files
- Updated Mar 31, 2020 - Python
🆔 Examples for using the dedupe library
- Updated Aug 10, 2024 - Python
A list of free data matching and record linkage software.
- Updated Feb 21, 2024
Recent trends of Entity Linking, Disambiguation, and Representation.
- Updated Jun 26, 2021
This repository contains code and datasets related to entity/knowledge papers from the VERT (Versatile Entity Recognition & disambiguation Toolkit) project, by the Knowledge Computing group at Microsoft Research Asia (MSRA).
- Updated Mar 16, 2024 - Python
ReFinED is an efficient and accurate entity linking (EL) system.
- Updated Dec 13, 2024 - Python
An open source, high scalability toolkit in Java for Entity Resolution.
- Updated Jul 12, 2025 - Java
🔎 Finds fuzzy matches between CSV files
- Updated Mar 26, 2025 - Python
Strwythura: construct a knowledge graph from unstructured data sources, organized by results from entity resolution, implementing an enhanced GraphRAG approach, and also implementing an ontology pipeline plus context engineering for optimizing AI application outcomes within a specific domain
- Updated Dec 17, 2025 - Jupyter Notebook
Entity resolution for Elasticsearch.
- Updated Dec 16, 2025 - Java
PyTorch library for transforming entities like companies, products, etc. into vectors to support scalable Record Linkage / Entity Resolution using Approximate Nearest Neighbors.
- Updated Nov 18, 2022 - Jupyter Notebook
A convenient way to link, deduplicate, aggregate and cluster data(frames) in Python using deep learning
- Updated Nov 5, 2025 - Python
Resources for tackling record linkage / deduplication / data matching problems
- Updated Feb 22, 2024
OpenRefine reconciliation services for VIAF, ORCID, and Open Library + framework for creating more.
- Updated Jun 18, 2025 - Java
Created by Halbert L. Dunn
Released 1946
- Followers: 49
- Organization: entity-resolution
- Website: github.com/topics/entity-resolution