Commit bf0cb52

Use Presidio for masking (#40)

* Use Presidio for masking
* Improve PII to handle encoded content
* Reject large encoded content as DOS
* Handle hex errors
* Fix structured output masking path

1 parent d2ba595, commit bf0cb52

28 files changed: +1265 -185 lines changed

‎docs/ref/checks/competitors.md‎

Lines changed: 1 addition & 3 deletions
@@ -30,11 +30,9 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 {
     "guardrail_name": "Competitor Detection",
     "competitors_found": ["competitor1"],
-    "checked_competitors": ["competitor1", "rival-company.com"],
-    "checked_text": "Original input text"
+    "checked_competitors": ["competitor1", "rival-company.com"]
 }
 ```
 
 - **`competitors_found`**: List of competitors detected in the text
 - **`checked_competitors`**: List of competitors that were configured for detection
-- **`checked_text`**: Original input text

‎docs/ref/checks/custom_prompt_check.md‎

Lines changed: 1 addition & 3 deletions
@@ -35,12 +35,10 @@ Returns a `GuardrailResult` with the following `info` dictionary:
     "guardrail_name": "Custom Prompt Check",
     "flagged": true,
     "confidence": 0.85,
-    "threshold": 0.7,
-    "checked_text": "Original input text"
+    "threshold": 0.7
 }
 ```
 
 - **`flagged`**: Whether the custom validation criteria were met
 - **`confidence`**: Confidence score (0.0 to 1.0) for the validation
 - **`threshold`**: The confidence threshold that was configured
-- **`checked_text`**: Original input text

‎docs/ref/checks/hallucination_detection.md‎

Lines changed: 1 addition & 3 deletions
@@ -113,8 +113,7 @@ Returns a `GuardrailResult` with the following `info` dictionary:
     "hallucination_type": "factual_error",
     "hallucinated_statements": ["Our premium plan costs $299/month"],
     "verified_statements": ["We offer customer support"],
-    "threshold": 0.7,
-    "checked_text": "Our premium plan costs $299/month and we offer customer support"
+    "threshold": 0.7
 }
 ```
 
@@ -125,7 +124,6 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 - **`hallucinated_statements`**: Specific statements that are contradicted or unsupported
 - **`verified_statements`**: Statements that are supported by your documents
 - **`threshold`**: The confidence threshold that was configured
-- **`checked_text`**: Original input text
 
 Tip: `hallucination_type` is typically one of `factual_error`, `unsupported_claim`, or `none`.
 

‎docs/ref/checks/jailbreak.md‎

Lines changed: 1 addition & 3 deletions
@@ -56,15 +56,13 @@ Returns a `GuardrailResult` with the following `info` dictionary:
     "guardrail_name": "Jailbreak",
     "flagged": true,
     "confidence": 0.85,
-    "threshold": 0.7,
-    "checked_text": "Original input text"
+    "threshold": 0.7
 }
 ```
 
 - **`flagged`**: Whether a jailbreak attempt was detected
 - **`confidence`**: Confidence score (0.0 to 1.0) for the detection
 - **`threshold`**: The confidence threshold that was configured
-- **`checked_text`**: Original input text
 
 ## Related checks
 

‎docs/ref/checks/keywords.md‎

Lines changed: 1 addition & 3 deletions
@@ -25,11 +25,9 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 {
     "guardrail_name": "Keyword Filter",
     "matched": ["confidential", "secret"],
-    "checked": ["confidential", "secret", "internal only"],
-    "checked_text": "This is confidential information that should be kept secret"
+    "checked": ["confidential", "secret", "internal only"]
 }
 ```
 
 - **`matched`**: List of keywords found in the text
 - **`checked`**: List of keywords that were configured for detection
-- **`checked_text`**: Original input text

‎docs/ref/checks/moderation.md‎

Lines changed: 1 addition & 3 deletions
@@ -57,12 +57,10 @@ Returns a `GuardrailResult` with the following `info` dictionary:
         "violence": 0.12,
         "self-harm": 0.08,
         "sexual": 0.03
-    },
-    "checked_text": "Original input text"
+    }
 }
 ```
 
 - **`flagged`**: Whether any category violation was detected
 - **`categories`**: Boolean flags for each category indicating violations
 - **`category_scores`**: Confidence scores (0.0 to 1.0) for each category
-- **`checked_text`**: Original input text

‎docs/ref/checks/nsfw.md‎

Lines changed: 1 addition & 3 deletions
@@ -44,15 +44,13 @@ Returns a `GuardrailResult` with the following `info` dictionary:
     "guardrail_name": "NSFW Text",
     "flagged": true,
     "confidence": 0.85,
-    "threshold": 0.7,
-    "checked_text": "Original input text"
+    "threshold": 0.7
 }
 ```
 
 - **`flagged`**: Whether NSFW content was detected
 - **`confidence`**: Confidence score (0.0 to 1.0) for the detection
 - **`threshold`**: The confidence threshold that was configured
-- **`checked_text`**: Original input text
 
 ### Examples
 

‎docs/ref/checks/off_topic_prompts.md‎

Lines changed: 1 addition & 3 deletions
@@ -35,12 +35,10 @@ Returns a `GuardrailResult` with the following `info` dictionary:
     "guardrail_name": "Off Topic Prompts",
     "flagged": false,
     "confidence": 0.85,
-    "threshold": 0.7,
-    "checked_text": "Original input text"
+    "threshold": 0.7
 }
 ```
 
 - **`flagged`**: Whether the content aligns with your business scope
 - **`confidence`**: Confidence score (0.0 to 1.0) for the prompt injection detection assessment
 - **`threshold`**: The confidence threshold that was configured
-- **`checked_text`**: Original input text

‎docs/ref/checks/pii.md‎

Lines changed: 45 additions & 6 deletions
@@ -2,22 +2,33 @@
 
 Detects personally identifiable information (PII) such as SSNs, phone numbers, credit card numbers, and email addresses using Microsoft's [Presidio library](https://microsoft.github.io/presidio/). Will automatically mask detected PII or block content based on configuration.
 
+**Advanced Security Features:**
+
+- **Unicode normalization**: Prevents bypasses using fullwidth characters (＠) or zero-width spaces
+- **Encoded PII detection**: Optionally detects PII hidden in Base64, URL-encoded, or hex strings
+- **URL context awareness**: Detects emails in query parameters (e.g., `GET /api?user=john@example.com`)
+- **Custom recognizers**: Includes CVV/CVC codes and BIC/SWIFT codes beyond Presidio defaults
+
 ## Configuration
 
 ```json
 {
     "name": "Contains PII",
     "config": {
-        "entities": ["EMAIL_ADDRESS", "US_SSN", "CREDIT_CARD", "PHONE_NUMBER"],
-        "block": false
+        "entities": ["EMAIL_ADDRESS", "US_SSN", "CREDIT_CARD", "PHONE_NUMBER", "CVV", "BIC_SWIFT"],
+        "block": false,
+        "detect_encoded_pii": false
     }
 }
 ```
 
 ### Parameters
 
-- **`entities`** (required): List of PII entity types to detect. See the full list of [supported entities](https://microsoft.github.io/presidio/supported_entities/).
+- **`entities`** (required): List of PII entity types to detect. Includes:
+  - Standard Presidio entities: See the full list of [supported entities](https://microsoft.github.io/presidio/supported_entities/)
+  - Custom entities: `CVV` (credit card security codes), `BIC_SWIFT` (bank identification codes)
 - **`block`** (optional): Whether to block content or just mask PII (default: `false`)
+- **`detect_encoded_pii`** (optional): If `true`, detects PII in Base64/URL-encoded/hex strings (default: `false`)
 
 ## Implementation Notes
 
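The "Unicode normalization" bullet added in this hunk can be illustrated with a minimal standard-library sketch. This is not the code from the commit: the function name `normalize_for_pii_scan` and the `ZERO_WIDTH` set are hypothetical, and the sketch only assumes that the guardrail folds compatibility characters and strips zero-width characters before scanning.

```python
import unicodedata

# Zero-width / invisible characters that can be used to split a token.
# Illustrative assumption: this is not the list used by the commit.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}


def normalize_for_pii_scan(text: str) -> str:
    """Fold compatibility forms with NFKC, then drop zero-width characters."""
    folded = unicodedata.normalize("NFKC", text)
    return "".join(ch for ch in folded if ch not in ZERO_WIDTH)


# A fullwidth "@" (U+FF20) plus a zero-width space (U+200B) hide the email.
sample = "john\u200b\uff20example.com"
print(normalize_for_pii_scan(sample))  # john@example.com
```

NFKC maps the fullwidth at sign to ASCII `@`, so a plain email pattern can match the normalized text; the zero-width characters need the explicit filter because NFKC leaves them in place.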

@@ -41,6 +52,8 @@ Detects personally identifiable information (PII) such as SSNs, phone numbers, c
 
 Returns a `GuardrailResult` with the following `info` dictionary:
 
+### Basic Example (Plain PII)
+
 ```json
 {
     "guardrail_name": "Contains PII",
@@ -55,8 +68,34 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 }
 ```
 
-- **`detected_entities`**: Detected entities and their values
+### With Encoded PII Detection Enabled
+
+When `detect_encoded_pii: true`, the guardrail also detects and masks encoded PII:
+
+```json
+{
+    "guardrail_name": "Contains PII",
+    "detected_entities": {
+        "EMAIL_ADDRESS": [
+            "user@email.com",
+            "am9obkBleGFtcGxlLmNvbQ==",
+            "%6a%6f%65%40domain.com",
+            "6a6f686e406578616d706c652e636f6d"
+        ]
+    },
+    "entity_types_checked": ["EMAIL_ADDRESS"],
+    "checked_text": "Contact <EMAIL_ADDRESS> or <EMAIL_ADDRESS_ENCODED> or <EMAIL_ADDRESS_ENCODED>",
+    "block_mode": false,
+    "pii_detected": true
+}
+```
+
+Note: Encoded PII is masked with `<ENTITY_TYPE_ENCODED>` to distinguish it from plain text PII.
+
+### Field Descriptions
+
+- **`detected_entities`**: Detected entities and their values (includes both plain and encoded forms when `detect_encoded_pii` is enabled)
 - **`entity_types_checked`**: List of entity types that were configured for detection
-- **`checked_text`**: Text with PII masked (if PII was found) or original text (if no PII was found)
+- **`checked_text`**: Text with PII masked. Plain PII uses `<ENTITY_TYPE>`, encoded PII uses `<ENTITY_TYPE_ENCODED>`
 - **`block_mode`**: Whether the check was configured to block or mask
-- **`pii_detected`**: Boolean indicating if any PII was found
+- **`pii_detected`**: Boolean indicating if any PII was found (plain or encoded)
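The encoded values in the example above decode to ordinary email addresses: the Base64 and hex strings both decode to `john@example.com`, and the URL-encoded string to `joe@domain.com`. The sketch below shows one way such detection could work using only the standard library. It is an assumption-laden illustration, not the commit's implementation: `decode_candidates`, `mask_encoded_emails`, `EMAIL_RE`, and the `MAX_ENCODED_LEN` value are hypothetical, and the real guardrail delegates entity detection to Presidio rather than a regex. The size cap mirrors the commit message's "Reject large encoded content" item.

```python
import base64
import re
from urllib.parse import unquote

MAX_ENCODED_LEN = 10_000  # hypothetical cap; oversized tokens are skipped
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def decode_candidates(token: str) -> list[str]:
    """Return plain-text decodings of token via Base64, hex, and URL decoding."""
    if len(token) > MAX_ENCODED_LEN:
        return []
    decoded = []
    try:
        decoded.append(base64.b64decode(token, validate=True).decode("utf-8"))
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        pass
    try:
        decoded.append(bytes.fromhex(token).decode("utf-8"))
    except ValueError:  # non-hex input or bytes that are not valid UTF-8
        pass
    url_decoded = unquote(token)
    if url_decoded != token:
        decoded.append(url_decoded)
    return decoded


def mask_encoded_emails(text: str) -> str:
    """Replace whitespace-separated tokens that decode to an email address."""
    masked = []
    for token in text.split():
        if any(EMAIL_RE.fullmatch(c) for c in decode_candidates(token)):
            masked.append("<EMAIL_ADDRESS_ENCODED>")
        else:
            masked.append(token)
    return " ".join(masked)


text = "Contact am9obkBleGFtcGxlLmNvbQ== or %6a%6f%65%40domain.com"
print(mask_encoded_emails(text))
# Contact <EMAIL_ADDRESS_ENCODED> or <EMAIL_ADDRESS_ENCODED>
```

Plain-text addresses are deliberately left untouched here, since in the documented design they are masked separately with `<EMAIL_ADDRESS>`.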

‎docs/ref/checks/prompt_injection_detection.md‎

Lines changed: 1 addition & 3 deletions
@@ -73,8 +73,7 @@ Returns a `GuardrailResult` with the following `info` dictionary:
             "name": "get_weather",
             "arguments": "{'location': 'Tokyo'}"
         }
-    ],
-    "checked_text": "[{'role': 'user', 'content': 'What is the weather in Tokyo?'}]"
+    ]
 }
 ```
 
@@ -84,7 +83,6 @@ Returns a `GuardrailResult` with the following `info` dictionary:
 - **`threshold`**: The confidence threshold that was configured
 - **`user_goal`**: The tracked user intent from conversation
 - **`action`**: The list of function calls or tool outputs analyzed for alignment
-- **`checked_text`**: Serialized conversation history inspected during analysis
 
 ## Benchmark Results
 
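These documentation diffs drop `checked_text` from the `info` dictionary of most checks, while the PII check still documents it. Assuming the runtime dictionaries match the updated docs, code that consumes results may want to read the field defensively. The helper `get_checked_text` below is a hypothetical sketch over plain dictionaries, not part of the library.

```python
from typing import Any, Optional


def get_checked_text(info: dict[str, Any]) -> Optional[str]:
    """Return `checked_text` when a check reports it, otherwise None."""
    value = info.get("checked_text")
    return value if isinstance(value, str) else None


# Example dictionaries shaped like the documented `info` payloads.
jailbreak_info = {
    "guardrail_name": "Jailbreak",
    "flagged": True,
    "confidence": 0.85,
    "threshold": 0.7,
}
pii_info = {
    "guardrail_name": "Contains PII",
    "checked_text": "Contact <EMAIL_ADDRESS>",
    "pii_detected": True,
}

print(get_checked_text(jailbreak_info))  # None
print(get_checked_text(pii_info))        # Contact <EMAIL_ADDRESS>
```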
