Dev/steven/remove anonymizer #47

Merged
gabor-openai merged 5 commits into main from dev/steven/remove_anonymizer
Nov 10, 2025

Conversation

steven10a (Collaborator) commented Nov 10, 2025 (edited)
Remove dependency on presidio-anonymizer to prevent package clash

  • presidio-anonymizer was being used to mask detected entities. This PR implements that functionality directly, without depending on an external library
  • Additionally, updates BIC_SWIFT detection, which was producing too many false positives
  • Updated and added relevant tests

Copilot AI left a comment

Pull Request Overview

This PR removes the dependency on presidio-anonymizer and implements custom text masking functionality to prevent package conflicts. The changes include a new lightweight anonymizer module, improved BIC/SWIFT detection patterns to reduce false positives, and comprehensive test coverage for both the anonymizer and BIC detection.

Key changes:

  • Implemented custom anonymizer in src/guardrails/utils/anonymizer.py to replace presidio-anonymizer
  • Enhanced BIC/SWIFT detection with context-aware patterns and bank code whitelisting to reduce false positives
  • Added baseline tests to ensure the custom anonymizer matches expected behavior
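As a rough illustration of the second bullet, the sketch below accepts a BIC-shaped token only if its bank code is on a whitelist or a "BIC"/"SWIFT" keyword appears just before it. The whitelist contents, the context rule, and the names `KNOWN_BANK_CODES`, `BIC_SHAPE`, `CONTEXT`, and `find_bics` are all assumptions made for this example; the PR's actual patterns in src/guardrails/checks/text/pii.py may differ.

```python
import re

# Hypothetical, abbreviated whitelist; the list in the PR is longer.
KNOWN_BANK_CODES = ("DEUT", "HSBC", "CITI", "SCBL")

# ISO 9362 shape: 4-letter bank code, 2-letter country code,
# 2-character location code, optional 3-character branch code.
BIC_SHAPE = re.compile(r"\b([A-Z]{4})[A-Z]{2}[A-Z0-9]{2}(?:[A-Z0-9]{3})?\b")

# Hypothetical context cue: "BIC" or "SWIFT" followed only by a few
# non-alphanumeric characters right before the candidate.
CONTEXT = re.compile(r"\b(?:BIC|SWIFT)\b[^A-Za-z0-9]{0,10}$", re.IGNORECASE)


def find_bics(text: str) -> list[str]:
    """Return BIC-shaped tokens that pass the whitelist or context check."""
    hits = []
    for match in BIC_SHAPE.finditer(text):
        known_bank = match.group(1) in KNOWN_BANK_CODES
        has_context = CONTEXT.search(text[: match.start()]) is not None
        if known_bank or has_context:
            hits.append(match.group(0))
    return hits
```

Without the extra check, any 8-letter uppercase word such as "CUSTOMER" fits the BIC shape, which is the kind of false positive the PR description refers to.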

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

File | Description
src/guardrails/utils/anonymizer.py | New module implementing custom PII masking functionality with overlap resolution
src/guardrails/checks/text/pii.py | Updated to use custom anonymizer and improved BIC/SWIFT detection patterns
src/guardrails/_base_client.py | Migrated from presidio-anonymizer to custom anonymizer for structured content masking
tests/unit/checks/test_pii.py | Added tests for BIC/SWIFT detection including false positive prevention
tests/unit/checks/test_anonymizer_baseline.py | New baseline tests ensuring custom anonymizer produces expected results
pyproject.toml | Removed presidio-anonymizer dependency
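A minimal sketch of what "PII masking with overlap resolution" could look like is given below. It is not the code from src/guardrails/utils/anonymizer.py: the `Span` type, the function names, and the longest-span-wins policy are assumptions for illustration only. The `<ENTITY_TYPE>` placeholder format follows the `<PHONE_NUMBER>` token that appears in the PR's tests.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """A detected entity (hypothetical type for this sketch)."""
    entity_type: str
    start: int  # inclusive character offset
    end: int    # exclusive character offset


def resolve_overlaps(spans):
    """Prefer longer spans; drop any span that overlaps one already kept."""
    kept = []
    for span in sorted(spans, key=lambda s: (-(s.end - s.start), s.start)):
        if all(span.end <= k.start or span.start >= k.end for k in kept):
            kept.append(span)
    return kept


def mask(text, spans):
    """Replace each surviving span with an <ENTITY_TYPE> placeholder."""
    survivors = sorted(resolve_overlaps(spans), key=lambda s: s.start, reverse=True)
    # Work right to left so earlier offsets remain valid after each edit.
    for span in survivors:
        text = text[: span.start] + f"<{span.entity_type}>" + text[span.end :]
    return text
```

Replacing from the end of the string backwards avoids having to recompute offsets after each substitution changes the text length.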


Copilot AI left a comment

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 10 comments.



Copilot AI left a comment

Pull Request Overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.


"ANZB|NATA|WPAC|CTBA|"  # Australia
"BKCH|MHCB|BOTK|"  # Japan
"ICBK|ABOC|PCBC|"  # China
"HSBC|SCBL|"  # Hong Kong


"HSBC" is duplicated in the known_bank_codes string. It already appears in line 176 as part of the "Major international" banks. Consider removing the duplicate "HSBC" from the Hong Kong section to avoid redundancy.

Suggested change
- "HSBC|SCBL|"  # Hong Kong
+ "SCBL|"  # Hong Kong

steven10a (Collaborator, Author) replied:

[nit]
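A repeated alternative in a regex alternation does not change which strings match, which is consistent with treating this as a nit. If the list keeps growing, a small check along the following lines could flag repeats automatically. The helper name `duplicate_codes` and the sample string are hypothetical and not part of the PR.

```python
from collections import Counter


def duplicate_codes(alternation: str) -> list[str]:
    """Return bank codes that occur more than once in a '|'-separated string."""
    # A trailing '|' produces an empty piece, so empty strings are skipped.
    codes = [code for code in alternation.split("|") if code]
    return sorted(code for code, count in Counter(codes).items() if count > 1)
```

Called on the concatenated bank-code string, this would return the repeated codes in sorted order, or an empty list if there are none.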

Comment on lines +162 to +165
# Check that at least the first format is detected
assert "<PHONE_NUMBER>" in result1.info["checked_text"]  # noqa: S101



The texts_and_results list is created and appended to but never used. Consider removing this unused variable or adding assertions that use it.

Suggested change
# Check that at least the first format is detected
assert "<PHONE_NUMBER>" in result1.info["checked_text"]  # noqa: S101
# Check that all formats are detected
for original, masked in texts_and_results:
    assert "<PHONE_NUMBER>" in masked  # noqa: S101

steven10a (Collaborator, Author) replied:

This is a fine unit test for now

"ICBK|ABOC|PCBC|"  # China
"HSBC|SCBL|"  # Hong Kong
"DBSS|OCBC|UOVB|"  # Singapore
"CZNB|SHBK|KOEX|HVBK|NACF|IBKO|KODB|HNBN|CITI"  # South Korea


"CITI" is duplicated in the known_bank_codes string. It already appears in line 176 as part of the "Major international" banks. Consider removing the duplicate "CITI" from the South Korea section to avoid redundancy.

Suggested change
- "CZNB|SHBK|KOEX|HVBK|NACF|IBKO|KODB|HNBN|CITI"  # South Korea
+ "CZNB|SHBK|KOEX|HVBK|NACF|IBKO|KODB|HNBN"  # South Korea

steven10a (Collaborator, Author) replied:

[nit]

gabor-openai merged commit bf65130 into main Nov 10, 2025
9 checks passed
gabor-openai deleted the dev/steven/remove_anonymizer branch November 10, 2025 20:30

Reviewers

Copilot (code review) left review comments

gabor-openai approved these changes


3 participants

@steven10a @gabor-openai
