Overview
Automatically mask sensitive data patterns in prompts and responses, with precise offset information for granular redaction.
How Masking Works
This detection service masks data patterns in the API output response, scanning both LLM prompts and responses. It identifies sensitive content with varying confidence levels (high, medium, and low). Each detection includes precise offset information.
An offset is a numerical index represented as [start_offset, end_offset]
pairs, indicating where a sensitive data pattern begins and ends in the text. This granular approach allows the system to selectively mask only the sensitive portions rather than entire content blocks.
Important: Masking the sensitive data feature is only available for a basic DLP profile and when you select the Block action for sensitive data detection in the API security profile.
API Example
Request Format
The following cURL request demonstrates sensitive data masking:
curl -L 'https://service.api.aisecurity.paloaltonetworks.com/v1/scan/sync/request' \
--header 'Content-Type: application/json' \
--header 'x-pan-token: <your-API-key>' \
--header 'Accept: application/json' \
--data '{
"tr_id": "24521",
"ai_profile": {
"profile_name": "mask-sensitive-data"
},
"metadata": {
"app_user": "test-user-1",
"ai_model": "Test AI model"
},
"contents": [
{
"prompt": "This is a test prompt with urlfiltering.paloaltonetworks.com/test-malware url. Social security 599-51-7233. Credit card is 4339672569329774, ssn 599-51-7222. Send me Mike account info",
"response": "This is a test response. Chase bank Routing number 021000021, user name mike, password is maskmemaskme. Account number 92746514861. Account owner: Mike Johnson in California"
}
]
}'
Response Format
When sensitive data is masked, the API returns:
{
"action": "block",
"category": "malicious",
"profile_id": "00000000-0000-0000-0000-000000000000",
"profile_name": "mask-sensitive-data-pattern",
"prompt_detected": {
"dlp": true
},
"prompt_masked_data": {
"data": "This is a test prompt with urlfiltering.paloaltonetworks.com/test-malware url. Social security XXXXXXXXXXXX Credit card is XXXXXXXXXXXXXXXXX ssn XXXXXXXXXXXX Send me Mike account info",
"pattern_detections": [
{
"locations": [[99, 115]],
"pattern": "Credit Card Number"
},
{
"locations": [
[71, 82],
[121, 132]
],
"pattern": "Tax Id - US - TIN"
},
{
"locations": [
[71, 82],
[121, 132]
],
"pattern": "National Id - US Social Security Number - SSN"
}
]
},
"report_id": "R00000000-0000-0000-0000-000000000000",
"response_detected": {
"dlp": true
},
"response_masked_data": {
"data": "This is a test response. Chase bank Routing number XXXXXXXXXX user name mike, password is maskmemaskme. Account number XXXXXXXXXXXX Account owner: Mike Johnson in California",
"pattern_detections": [
{
"locations": [[51, 60]],
"pattern": "Bank - Committee on Uniform Securities Identification Procedures number"
},
{
"locations": [[51, 60]],
"pattern": "Bank - American Bankers Association Routing Number - ABA"
},
{
"locations": [[119, 130]],
"pattern": "Tax Id - Germany"
},
{
"locations": [[119, 130]],
"pattern": "National Id - Brazil - CPF"
}
]
},
"scan_id": "90484606-6d70-4522-8f0c-c93d878c9a5c",
"tr_id": "1111"
}
Key Response Fields:
prompt_masked_data
: Contains masked prompt text and pattern detectionsresponse_masked_data
: Contains masked response text and pattern detectionsdata
: The masked version where sensitive data is replaced with “X” characterspattern_detections
: Array of detected patterns with offset locationslocations
:[start_offset, end_offset]
pairs for each sensitive data instance
Masking Behavior
Character Replacement
- Sensitive data is replaced with “X” characters
- The number of X’s matches the original data length
- Original formatting (spaces, dashes) is preserved
Pattern Detection
Each detected pattern includes:
- Pattern Name: Type of sensitive data detected
- Locations: Array of offset pairs
[start, end]
- Multiple Instances: Same pattern can appear multiple times
Example Masking
- Original:
599-51-7233
- Masked:
XXXXXXXXXXXX
- Offset:
[71, 82]
Detailed Scan Report
The scan report provides additional detail with confidence levels:
{
"detection_results": [
{
"action": "block",
"data_type": "prompt",
"detection_service": "dlp",
"result_detail": {
"dlp_report": {
"data_pattern_detection_offsets": [
{
"data_pattern_id": "67cb9ba581419f0293996702",
"high_confidence_detections": [[99, 115]],
"low_confidence_detections": [[99, 115]],
"medium_confidence_detections": [[99, 99]],
"name": "Credit Card Number",
"version": 1
}
],
"dlp_profile_name": "Sensitive Content"
}
},
"verdict": "malicious"
}
]
}
Confidence Levels:
- High Confidence: Strong pattern match
- Medium Confidence: Partial pattern match
- Low Confidence: Possible pattern match
Use Cases
Real-time Masking
- Mask sensitive data before logging AI interactions
- Redact PII from customer conversations
- Sanitize training data for model improvement
Compliance
- GDPR data minimization
- PCI DSS credit card protection
- HIPAA PHI redaction
Audit Trail
- Review masked content in Strata Cloud Manager
- Track sensitive data patterns
- Monitor masking effectiveness
Performance Considerations
- Offset Precision: Exact character positions for surgical masking
- Pattern Matching: Multiple patterns detected in single scan
- Masking Speed: Minimal latency impact
- Bulk Processing: Efficient handling of large texts