Context aware, pluggable and customizable data protection and de-identification SDK for text and images


microsoft/presidio


Context aware, pluggable and customizable PII de-identification service for text and images.



  • Presidio Analyzer
  • Presidio Anonymizer
  • Presidio Image-Redactor
  • Presidio Structured

What is Presidio

Presidio (from the Latin praesidium, ‘protection, garrison’) helps ensure that sensitive data is properly managed and governed. It provides fast identification and anonymization modules for private entities in text, such as credit card numbers, names, locations, social security numbers, bitcoin wallets, US phone numbers, financial data, and more.
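
The identify-then-anonymize flow described above can be illustrated with a minimal standard-library sketch. This is not Presidio's API or implementation: `Finding`, `find_phone_numbers`, `anonymize`, and the phone pattern are hypothetical names chosen for illustration, and the pattern covers only a few common US phone formats.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    """A detected span: entity label plus character offsets (hypothetical type)."""
    entity_type: str
    start: int
    end: int


# Illustrative pattern only: matches forms like 212-555-0123 or (212) 555-0123.
PHONE_RE = re.compile(r"(?:\(\d{3}\)\s?|\b\d{3}[-.\s])\d{3}[-.\s]\d{4}\b")


def find_phone_numbers(text: str) -> list[Finding]:
    """Identification step: return one Finding per regex match."""
    return [
        Finding("PHONE_NUMBER", m.start(), m.end())
        for m in PHONE_RE.finditer(text)
    ]


def anonymize(text: str, findings: list[Finding]) -> str:
    """Anonymization step: replace each span with a <ENTITY_TYPE> placeholder.

    Spans are processed right to left so that earlier offsets stay valid.
    """
    for f in sorted(findings, key=lambda f: f.start, reverse=True):
        text = text[:f.start] + f"<{f.entity_type}>" + text[f.end:]
    return text


text = "Call me at 212-555-0123 or (212) 555-0199."
redacted = anonymize(text, find_phone_numbers(text))
print(redacted)
```

Keeping detection and replacement as separate steps means the same findings could feed a different operator (masking, hashing, and so on) without re-running detection.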



💭Demo


Are you using Presidio? We'd love to know how

Please help us improve by taking this short anonymous survey.


Goals

  • Allow organizations to preserve privacy in a simpler way by democratizing de-identification technologies and introducing transparency in decisions.
  • Embrace extensibility and customizability to fit specific business needs.
  • Facilitate both fully automated and semi-automated PII de-identification flows on multiple platforms.

Main features

  1. Predefined or custom PII recognizers leveraging Named Entity Recognition, regular expressions, rule-based logic, and checksum with relevant context in multiple languages.
  2. Options for connecting to external PII detection models.
  3. Multiple usage options, from Python or PySpark workloads through Docker to Kubernetes.
  4. Customizability in PII identification and de-identification.
  5. Module for redacting PII text in images (standard image types and DICOM medical images).
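
To make the first feature more concrete, here is a minimal standard-library sketch of how a regular expression, a checksum, and nearby context words can be combined in one recognizer. It is an illustration of the general technique, not Presidio's recognizer code: the function names, the context word list, the 30-character window, and the 0.5/0.9 scores are all assumptions made for this example.

```python
import re

# Candidate pattern: 13-19 digits, optionally separated by spaces or hyphens.
CARD_RE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

# Hypothetical context words that raise confidence when they appear nearby.
CONTEXT_WORDS = ("card", "credit", "visa", "mastercard", "amex")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect_credit_cards(text: str, window: int = 30) -> list[dict]:
    """Regex finds candidates, the checksum filters them, context sets the score."""
    results = []
    for m in CARD_RE.finditer(text):
        digits = re.sub(r"[ -]", "", m.group())
        if not luhn_valid(digits):
            continue  # shaped like a card number but fails the checksum
        score = 0.5  # illustrative base score, not a Presidio default
        before = text[max(0, m.start() - window):m.start()].lower()
        if any(word in before for word in CONTEXT_WORDS):
            score = 0.9  # illustrative boost for supporting context
        results.append({
            "entity_type": "CREDIT_CARD",
            "start": m.start(),
            "end": m.end(),
            "score": score,
        })
    return results


sample = "My credit card is 4111 1111 1111 1111, order id 1234 5678 9012 3456."
hits = detect_credit_cards(sample)
for hit in hits:
    print(hit)
```

In this sample both numbers match the pattern, but only the first passes the checksum, and the words preceding it raise its score. The checksum cuts false positives from digit runs such as order IDs, while context helps rank matches that remain ambiguous.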

⚠️ Presidio can help identify sensitive/PII data in structured and unstructured text. However, because it uses automated detection mechanisms, there is no guarantee that Presidio will find all sensitive information. Consequently, additional systems and protections should be employed.

Installing Presidio

  1. Using pip
  2. Using Docker
  3. From source
  4. Migrating from V1 to V2

Running Presidio

  1. Getting started
  2. Setting up a development environment
  3. PII de-identification in text
  4. PII de-identification in images
  5. Usage samples and example deployments

Support

Contributing

For details on contributing to this repository, see the contributing guide.

This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.

This project has adopted the Microsoft Open Source Code of Conduct. For more information, see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.


