openai/openai-guardrails-python (Public)

Commit bf0cb52

steven10a authored

Use Presidio for masking (#40)

* Use Presidio for masking
* Improve PII to handle encoded content
* Reject large encoded content as DOS
* Handle hex errors
* Fix structured output masking path

1 parent d2ba595, commit bf0cb52

File tree

28 files changed

+1265

-185

lines changed

docs/ref/checks
examples/basic
- pii_mask_example.py
pyproject.toml
src/guardrails
- _base_client.py
- checks/text
- utils
  - safety_identifier.py
tests/unit

`‎docs/ref/checks/competitors.md‎`

Lines changed: 1 addition & 3 deletions

Original file line number	Diff line number	Diff line change
@@ -30,11 +30,9 @@ Returns a `GuardrailResult` with the following `info` dictionary:
`30`	`30`	`{`
`31`	`31`	`"guardrail_name":"Competitor Detection",`
`32`	`32`	`"competitors_found": ["competitor1"],`
`33`		`-"checked_competitors": ["competitor1","rival-company.com"],`
`34`		`-"checked_text":"Original input text"`
	`33`	`+"checked_competitors": ["competitor1","rival-company.com"]`
`35`	`34`	`}`
`36`	`35`	```
`37`	`36`
`38`	`37`	- `competitors_found`: List of competitors detected in the text
`39`	`38`	- `checked_competitors`: List of competitors that were configured for detection
`40`		-- `checked_text`: Original input text

`‎docs/ref/checks/custom_prompt_check.md‎`

Lines changed: 1 addition & 3 deletions

Original file line number	Diff line number	Diff line change
@@ -35,12 +35,10 @@ Returns a `GuardrailResult` with the following `info` dictionary:
`35`	`35`	`"guardrail_name":"Custom Prompt Check",`
`36`	`36`	`"flagged":true,`
`37`	`37`	`"confidence":0.85,`
`38`		`-"threshold":0.7,`
`39`		`-"checked_text":"Original input text"`
	`38`	`+"threshold":0.7`
`40`	`39`	`}`
`41`	`40`	```
`42`	`41`
`43`	`42`	- `flagged`: Whether the custom validation criteria were met
`44`	`43`	- `confidence`: Confidence score (0.0 to 1.0) for the validation
`45`	`44`	- `threshold`: The confidence threshold that was configured
`46`		-- `checked_text`: Original input text

`‎docs/ref/checks/hallucination_detection.md‎`

Lines changed: 1 addition & 3 deletions

Original file line number	Diff line number	Diff line change
@@ -113,8 +113,7 @@ Returns a `GuardrailResult` with the following `info` dictionary:
`113`	`113`	`"hallucination_type":"factual_error",`
`114`	`114`	`"hallucinated_statements": ["Our premium plan costs $299/month"],`
`115`	`115`	`"verified_statements": ["We offer customer support"],`
`116`		`-"threshold":0.7,`
`117`		`-"checked_text":"Our premium plan costs $299/month and we offer customer support"`
	`116`	`+"threshold":0.7`
`118`	`117`	`}`
`119`	`118`	```
`120`	`119`
@@ -125,7 +124,6 @@ Returns a `GuardrailResult` with the following `info` dictionary:
`125`	`124`	- `hallucinated_statements`: Specific statements that are contradicted or unsupported
`126`	`125`	- `verified_statements`: Statements that are supported by your documents
`127`	`126`	- `threshold`: The confidence threshold that was configured
`128`		-- `checked_text`: Original input text
`129`	`127`
`130`	`128`	Tip: `hallucination_type` is typically one of `factual_error`, `unsupported_claim`, or `none`.
`131`	`129`

`‎docs/ref/checks/jailbreak.md‎`

Lines changed: 1 addition & 3 deletions

Original file line number	Diff line number	Diff line change
@@ -56,15 +56,13 @@ Returns a `GuardrailResult` with the following `info` dictionary:
`56`	`56`	`"guardrail_name":"Jailbreak",`
`57`	`57`	`"flagged":true,`
`58`	`58`	`"confidence":0.85,`
`59`		`-"threshold":0.7,`
`60`		`-"checked_text":"Original input text"`
	`59`	`+"threshold":0.7`
`61`	`60`	`}`
`62`	`61`	```
`63`	`62`
`64`	`63`	- `flagged`: Whether a jailbreak attempt was detected
`65`	`64`	- `confidence`: Confidence score (0.0 to 1.0) for the detection
`66`	`65`	- `threshold`: The confidence threshold that was configured
`67`		-- `checked_text`: Original input text
`68`	`66`
`69`	`67`	`## Related checks`
`70`	`68`

`‎docs/ref/checks/keywords.md‎`

Lines changed: 1 addition & 3 deletions

Original file line number	Diff line number	Diff line change
@@ -25,11 +25,9 @@ Returns a `GuardrailResult` with the following `info` dictionary:
`25`	`25`	`{`
`26`	`26`	`"guardrail_name":"Keyword Filter",`
`27`	`27`	`"matched": ["confidential","secret"],`
`28`		`-"checked": ["confidential","secret","internal only"],`
`29`		`-"checked_text":"This is confidential information that should be kept secret"`
	`28`	`+"checked": ["confidential","secret","internal only"]`
`30`	`29`	`}`
`31`	`30`	```
`32`	`31`
`33`	`32`	- `matched`: List of keywords found in the text
`34`	`33`	- `checked`: List of keywords that were configured for detection
`35`		-- `checked_text`: Original input text

`‎docs/ref/checks/moderation.md‎`

Lines changed: 1 addition & 3 deletions

Original file line number	Diff line number	Diff line change
@@ -57,12 +57,10 @@ Returns a `GuardrailResult` with the following `info` dictionary:
`57`	`57`	`"violence":0.12,`
`58`	`58`	`"self-harm":0.08,`
`59`	`59`	`"sexual":0.03`
`60`		`- },`
`61`		`-"checked_text":"Original input text"`
	`60`	`+ }`
`62`	`61`	`}`
`63`	`62`	```
`64`	`63`
`65`	`64`	- `flagged`: Whether any category violation was detected
`66`	`65`	- `categories`: Boolean flags for each category indicating violations
`67`	`66`	- `category_scores`: Confidence scores (0.0 to 1.0) for each category
`68`		-- `checked_text`: Original input text

`‎docs/ref/checks/nsfw.md‎`

Lines changed: 1 addition & 3 deletions

Original file line number	Diff line number	Diff line change
@@ -44,15 +44,13 @@ Returns a `GuardrailResult` with the following `info` dictionary:
`44`	`44`	`"guardrail_name":"NSFW Text",`
`45`	`45`	`"flagged":true,`
`46`	`46`	`"confidence":0.85,`
`47`		`-"threshold":0.7,`
`48`		`-"checked_text":"Original input text"`
	`47`	`+"threshold":0.7`
`49`	`48`	`}`
`50`	`49`	```
`51`	`50`
`52`	`51`	- `flagged`: Whether NSFW content was detected
`53`	`52`	- `confidence`: Confidence score (0.0 to 1.0) for the detection
`54`	`53`	- `threshold`: The confidence threshold that was configured
`55`		-- `checked_text`: Original input text
`56`	`54`
`57`	`55`	`### Examples`
`58`	`56`

`‎docs/ref/checks/off_topic_prompts.md‎`

Lines changed: 1 addition & 3 deletions

Original file line number	Diff line number	Diff line change
@@ -35,12 +35,10 @@ Returns a `GuardrailResult` with the following `info` dictionary:
`35`	`35`	`"guardrail_name":"Off Topic Prompts",`
`36`	`36`	`"flagged":false,`
`37`	`37`	`"confidence":0.85,`
`38`		`-"threshold":0.7,`
`39`		`-"checked_text":"Original input text"`
	`38`	`+"threshold":0.7`
`40`	`39`	`}`
`41`	`40`	```
`42`	`41`
`43`	`42`	- `flagged`: Whether the content aligns with your business scope
`44`	`43`	- `confidence`: Confidence score (0.0 to 1.0) for the prompt injection detection assessment
`45`	`44`	- `threshold`: The confidence threshold that was configured
`46`		-- `checked_text`: Original input text

`‎docs/ref/checks/pii.md‎`

Lines changed: 45 additions & 6 deletions

Original file line number	Diff line number	Diff line change
`@@ -2,22 +2,33 @@`
`2`	`2`
`3`	`3`	`Detects personally identifiable information (PII) such as SSNs, phone numbers, credit card numbers, and email addresses using Microsoft's [Presidio library](https://microsoft.github.io/presidio/). Will automatically mask detected PII or block content based on configuration.`
`4`	`4`
	`5`	`+Advanced Security Features:`
	`6`	`+`
	`7`	`+- Unicode normalization: Prevents bypasses using fullwidth characters (＠) or zero-width spaces`
	`8`	`+- Encoded PII detection: Optionally detects PII hidden in Base64, URL-encoded, or hex strings`
	`9`	+- URL context awareness: Detects emails in query parameters (e.g., `GET /api?user=john@example.com`)
	`10`	`+- Custom recognizers: Includes CVV/CVC codes and BIC/SWIFT codes beyond Presidio defaults`
	`11`	`+`
`5`	`12`	`## Configuration`
`6`	`13`
`7`	`14`	```json
`8`	`15`	`{`
`9`	`16`	`"name":"Contains PII",`
`10`	`17`	`"config": {`
`11`		`-"entities": ["EMAIL_ADDRESS","US_SSN","CREDIT_CARD","PHONE_NUMBER"],`
`12`		`-"block":false`
	`18`	`+"entities": ["EMAIL_ADDRESS","US_SSN","CREDIT_CARD","PHONE_NUMBER","CVV","BIC_SWIFT"],`
	`19`	`+"block":false,`
	`20`	`+"detect_encoded_pii":false`
`13`	`21`	`}`
`14`	`22`	`}`
`15`	`23`	```
`16`	`24`
`17`	`25`	`### Parameters`
`18`	`26`
`19`		-- `entities` (required): List of PII entity types to detect. See the full list of [supported entities](https://microsoft.github.io/presidio/supported_entities/).
	`27`	+- `entities` (required): List of PII entity types to detect. Includes:
	`28`	`+- Standard Presidio entities: See the full list of [supported entities](https://microsoft.github.io/presidio/supported_entities/)`
	`29`	+- Custom entities: `CVV` (credit card security codes), `BIC_SWIFT` (bank identification codes)
`20`	`30`	- `block` (optional): Whether to block content or just mask PII (default: `false`)
	`31`	+- `detect_encoded_pii` (optional): If `true`, detects PII in Base64/URL-encoded/hex strings (default: `false`)
`21`	`32`
`22`	`33`	`## Implementation Notes`
`23`	`34`
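
The Unicode normalization feature listed in this hunk can be sketched as follows. This is a minimal illustration using only the standard library, not the code from this commit: the helper names `normalize_text` and `find_emails`, the choice of NFKC, the set of zero-width characters, and the simplified email pattern are all assumptions made for the example.

```python
import re
import unicodedata

# Zero-width characters that can split a token invisibly (illustrative set,
# not necessarily the one the guardrail uses).
_ZERO_WIDTH = dict.fromkeys(map(ord, "\u200b\u200c\u200d\u2060\ufeff"))

# Deliberately simple email pattern, for illustration only.
_EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def normalize_text(text: str) -> str:
    """Fold compatibility characters with NFKC, then drop zero-width characters."""
    folded = unicodedata.normalize("NFKC", text)
    return folded.translate(_ZERO_WIDTH)


def find_emails(text: str) -> list[str]:
    """Search for emails only after normalization, so lookalike forms match."""
    return _EMAIL_RE.findall(normalize_text(text))


if __name__ == "__main__":
    # Fullwidth "@" (U+FF20) and a zero-width space (U+200B) both normalize away.
    print(find_emails("john\uff20example.com"))
    print(find_emails("jo\u200bhn@example.com"))
```

Normalizing before detection means a single recognizer pattern covers both the plain and the obfuscated spellings, rather than each recognizer needing its own lookalike handling.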
`@@ -41,6 +52,8 @@ Detects personally identifiable information (PII) such as SSNs, phone numbers, c`
`41`	`52`
`42`	`53`	Returns a `GuardrailResult` with the following `info` dictionary:
`43`	`54`
	`55`	`+### Basic Example (Plain PII)`
	`56`	`+`
`44`	`57`	```json
`45`	`58`	`{`
`46`	`59`	`"guardrail_name":"Contains PII",`
@@ -55,8 +68,34 @@ Returns a `GuardrailResult` with the following `info` dictionary:
`55`	`68`	`}`
`56`	`69`	```
`57`	`70`
`58`		--`detected_entities`: Detected entities and their values
	`71`	`+### With Encoded PII Detection Enabled`
	`72`	`+`
	`73`	+When `detect_encoded_pii: true`, the guardrail also detects and masks encoded PII:
	`74`	`+`
	`75`	+```json
	`76`	`+{`
	`77`	`+"guardrail_name":"Contains PII",`
	`78`	`+"detected_entities": {`
	`79`	`+"EMAIL_ADDRESS": [`
	`80`	`+"user@email.com",`
	`81`	`+"am9obkBleGFtcGxlLmNvbQ==",`
	`82`	`+"%6a%6f%65%40domain.com",`
	`83`	`+"6a6f686e406578616d706c652e636f6d"`
	`84`	`+ ]`
	`85`	`+ },`
	`86`	`+"entity_types_checked": ["EMAIL_ADDRESS"],`
	`87`	`+"checked_text":"Contact <EMAIL_ADDRESS> or <EMAIL_ADDRESS_ENCODED> or <EMAIL_ADDRESS_ENCODED>",`
	`88`	`+"block_mode":false,`
	`89`	`+"pii_detected":true`
	`90`	`+}`
	`91`	+```
	`92`	`+`
	`93`	+Note: Encoded PII is masked with `<ENTITY_TYPE_ENCODED>` to distinguish it from plain text PII.
	`94`	`+`
	`95`	`+### Field Descriptions`
	`96`	`+`
	`97`	+- `detected_entities`: Detected entities and their values (includes both plain and encoded forms when `detect_encoded_pii` is enabled)
`59`	`98`	- `entity_types_checked`: List of entity types that were configured for detection
`60`		-- `checked_text`: Text with PII masked (if PII was found) or original text (if no PII was found)
	`99`	+- `checked_text`: Text with PII masked. Plain PII uses `<ENTITY_TYPE>`, encoded PII uses `<ENTITY_TYPE_ENCODED>`
`61`	`100`	- `block_mode`: Whether the check was configured to block or mask
`62`		-- `pii_detected`: Boolean indicating if any PII was found
	`101`	+- `pii_detected`: Boolean indicating if any PII was found (plain or encoded)
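
The encoded-PII behavior documented above (decode Base64, URL-encoded, and hex candidates, then mask hits with `<ENTITY_TYPE_ENCODED>`) might look roughly like the sketch below. It is not the implementation from this commit: it handles only emails, uses a plain regex instead of Presidio, and the candidate patterns, the `MAX_ENCODED_LEN` cap, and the function name `mask_encoded_emails` are hypothetical. Oversized tokens are simply left untouched here, whereas the commit message says large encoded content is rejected.

```python
import base64
import re
from urllib.parse import unquote

# Simplified email pattern, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

# Hypothetical cap so huge tokens are never decoded (denial-of-service guard).
MAX_ENCODED_LEN = 10_000


def _from_hex(token: str) -> str:
    return bytes.fromhex(token).decode("utf-8")


def _from_base64(token: str) -> str:
    return base64.b64decode(token, validate=True).decode("utf-8")


# Hex runs first: a hex string is also made of valid Base64 characters,
# so it must be consumed before the Base64 pattern sees it.
DECODERS = [
    (re.compile(r"\b(?:[0-9a-fA-F]{2}){8,}\b"), _from_hex),
    (re.compile(r"[A-Za-z0-9+/]{16,}={0,2}"), _from_base64),
    (re.compile(r"[A-Za-z0-9._%+@-]*%[0-9a-fA-F]{2}[A-Za-z0-9._%+@-]*"), unquote),
]


def mask_encoded_emails(text: str) -> str:
    """Replace encoded tokens that decode to an email with a placeholder."""

    def make_replacer(decode):
        def replace(match: re.Match) -> str:
            token = match.group(0)
            if len(token) > MAX_ENCODED_LEN:
                return token  # too large to decode safely; leave as-is
            try:
                decoded = decode(token)
            except ValueError:  # bad hex, bad Base64, or invalid UTF-8
                return token
            return "<EMAIL_ADDRESS_ENCODED>" if EMAIL_RE.search(decoded) else token

        return replace

    for pattern, decode in DECODERS:
        text = pattern.sub(make_replacer(decode), text)
    return text


if __name__ == "__main__":
    sample = (
        "Contact am9obkBleGFtcGxlLmNvbQ== or %6a%6f%65%40domain.com "
        "or 6a6f686e406578616d706c652e636f6d"
    )
    print(mask_encoded_emails(sample))
```

Catching `ValueError` covers malformed hex, malformed Base64 (`binascii.Error` is a subclass), and bytes that are not valid UTF-8, so a token that fails to decode is passed through unchanged.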

`‎docs/ref/checks/prompt_injection_detection.md‎`

Lines changed: 1 addition & 3 deletions

Original file line number	Diff line number	Diff line change
@@ -73,8 +73,7 @@ Returns a `GuardrailResult` with the following `info` dictionary:
`73`	`73`	`"name":"get_weather",`
`74`	`74`	`"arguments":"{'location': 'Tokyo'}"`
`75`	`75`	`}`
`76`		`- ],`
`77`		`-"checked_text":"[{'role': 'user', 'content': 'What is the weather in Tokyo?'}]"`
	`76`	`+ ]`
`78`	`77`	`}`
`79`	`78`	```
`80`	`79`
@@ -84,7 +83,6 @@ Returns a `GuardrailResult` with the following `info` dictionary:
`84`	`83`	- `threshold`: The confidence threshold that was configured
`85`	`84`	- `user_goal`: The tracked user intent from conversation
`86`	`85`	- `action`: The list of function calls or tool outputs analyzed for alignment
`87`		-- `checked_text`: Serialized conversation history inspected during analysis
`88`	`86`
`89`	`87`	`## Benchmark Results`
`90`	`88`

0 commit comments
