Context aware, pluggable and customizable data protection and de-identification SDK for text and images
microsoft/presidio
Context aware, pluggable and customizable PII de-identification service for text and images.
Presidio (from Latin praesidium, ‘protection, garrison’) helps ensure that sensitive data is properly managed and governed. It provides fast identification and anonymization modules for private entities in text such as credit card numbers, names, locations, social security numbers, bitcoin wallets, US phone numbers, financial data and more.
💭Demo
Please help us improve by taking this short anonymous survey.
- Allow organizations to preserve privacy in a simpler way by democratizing de-identification technologies and introducing transparency in decisions.
- Embrace extensibility and customizability to a specific business need.
- Facilitate both fully automated and semi-automated PII de-identification flows on multiple platforms.
- Predefined or custom PII recognizers leveraging Named Entity Recognition, regular expressions, rule-based logic and checksum with relevant context in multiple languages.
- Options for connecting to external PII detection models.
- Multiple usage options, from Python or PySpark workloads through Docker to Kubernetes.
- Customizability in PII identification and de-identification.
- Module for redacting PII text in images (standard image types and DICOM medical images).
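
To make the recognizer idea above concrete, here is a minimal standard-library sketch of how a regular expression, a checksum and nearby context words can be combined to detect and redact one entity type. It does not use Presidio and is not Presidio's implementation: the names `Finding`, `find_credit_cards`, `redact`, the context word list and the score values are all hypothetical, chosen only for illustration.

```python
import re
from dataclasses import dataclass

# All names and score values below are illustrative assumptions,
# not part of the Presidio API.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")  # 13 to 19 digits
CONTEXT_WORDS = {"card", "credit", "visa", "mastercard", "amex"}


@dataclass
class Finding:
    entity_type: str
    start: int
    end: int
    score: float


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_credit_cards(text: str, window: int = 30) -> list:
    """Regex match, then checksum validation, then a context-based score boost."""
    findings = []
    for m in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if not luhn_valid(digits):
            continue  # the pattern matched but the checksum rejects it
        score = 0.5  # base confidence for a checksum-valid match
        nearby = text[max(0, m.start() - window):m.start()].lower()
        if set(re.findall(r"[a-z]+", nearby)) & CONTEXT_WORDS:
            score = min(1.0, score + 0.35)  # supporting context raises confidence
        findings.append(Finding("CREDIT_CARD", m.start(), m.end(), score))
    return findings


def redact(text: str, findings: list) -> str:
    """Replace each finding with a placeholder, working right to left
    so that earlier offsets stay valid."""
    for f in sorted(findings, key=lambda f: f.start, reverse=True):
        text = text[:f.start] + f"<{f.entity_type}>" + text[f.end:]
    return text


sample = "My credit card is 4111 1111 1111 1111, thanks."
print(redact(sample, find_credit_cards(sample)))
```

The sketch keeps detection (`find_credit_cards`) separate from de-identification (`redact`), mirroring the split between identification and anonymization described earlier; a real deployment would rely on the project's own recognizers rather than this simplified logic.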
- Getting started
- Setting up a development environment
- PII de-identification in text
- PII de-identification in images
- Usage samples and example deployments
- Before you submit an issue, please go over the documentation.
- For general discussions, please use the GitHub repo's discussion board.
- If you have a usage question, found a bug or have a suggestion for improvement, please file a GitHub issue.
- For other matters, please email presidio@microsoft.com.
For details on contributing to this repository, see the contributing guide.
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.
When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.