Fail Modes Guide
Understanding and configuring fail-open vs fail-closed behavior.
Overview
When a scan fails (API error, timeout, network issue), the plugin must decide whether to block or allow the request:
| Mode | On Failure | Security | Availability |
|---|---|---|---|
| Fail-Closed | Block request | High | Lower |
| Fail-Open | Allow request | Lower | High |
Configuration
plugins:
  prisma-airs:
    fail_closed: true # default - block on failure
    # or
    fail_closed: false # allow on failure
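As a minimal sketch of how the flag could map to a decision, assuming an unset `fail_closed` falls back to the documented default of `true`: the `PrismaAirsConfig` interface and both function names are hypothetical and are not taken from the plugin source.

```typescript
// Hypothetical shape of the plugin's config section; only `fail_closed`
// comes from this guide.
interface PrismaAirsConfig {
  fail_closed?: boolean;
}

type FailureDecision = "block" | "allow";

// Resolve the effective mode. Anything other than an explicit `false`
// is treated as fail-closed, matching the documented default.
function isFailClosed(config: PrismaAirsConfig): boolean {
  return config.fail_closed !== false;
}

// Map the flag to the decision taken when a scan fails.
function decideOnScanFailure(config: PrismaAirsConfig): FailureDecision {
  return isFailClosed(config) ? "block" : "allow";
}
```

Treating only an explicit `false` as fail-open means a missing or misspelled key leaves the safer mode in place.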
Fail-Closed (Default)
Behavior
When a scan fails, the plugin:
- Creates a synthetic "block" result
- Caches it for downstream hooks
- Injects a warning into the agent context
- Blocks dangerous tools
- Blocks the outbound response with an error message
Synthetic Result
{
  "action": "block",
  "severity": "CRITICAL",
  "categories": ["scan-failure"],
  "scanId": "",
  "reportId": "",
  "profileName": "default",
  "promptDetected": {
    "injection": false,
    "dlp": false,
    "urlCats": false,
    "toxicContent": false,
    "maliciousCode": false,
    "agent": false,
    "topicViolation": false
  },
  "responseDetected": {
    "dlp": false,
    "urlCats": false,
    "dbSecurity": false,
    "toxicContent": false,
    "maliciousCode": false,
    "agent": false,
    "ungrounded": false,
    "topicViolation": false
  },
  "latencyMs": 0,
  "timeout": false,
  "hasError": true,
  "contentErrors": [],
  "error": "Scan failed: connection timeout"
}
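A result of this shape could be assembled from a caught error roughly as follows. This is a sketch only: the field names and values mirror the JSON above, but `buildScanFailureResult`, `allFalse`, and the `Flags` type are hypothetical helpers, not the plugin's actual API.

```typescript
type Flags = { [flag: string]: boolean };

interface SyntheticScanResult {
  action: "block";
  severity: "CRITICAL";
  categories: string[];
  scanId: string;
  reportId: string;
  profileName: string;
  promptDetected: Flags;
  responseDetected: Flags;
  latencyMs: number;
  timeout: boolean;
  hasError: boolean;
  contentErrors: string[];
  error: string;
}

// Detection flag names copied from the synthetic result shown above.
const PROMPT_FLAGS = ["injection", "dlp", "urlCats", "toxicContent",
  "maliciousCode", "agent", "topicViolation"];
const RESPONSE_FLAGS = ["dlp", "urlCats", "dbSecurity", "toxicContent",
  "maliciousCode", "agent", "ungrounded", "topicViolation"];

// Build a flag object with every detection set to false.
function allFalse(names: string[]): Flags {
  const flags: Flags = {};
  names.forEach(function (n) { flags[n] = false; });
  return flags;
}

// Hypothetical builder: nothing was actually detected, so every flag is
// false and the block is carried by `action` and the "scan-failure" category.
function buildScanFailureResult(err: Error, profileName: string = "default"): SyntheticScanResult {
  return {
    action: "block",
    severity: "CRITICAL",
    categories: ["scan-failure"],
    scanId: "",
    reportId: "",
    profileName: profileName,
    promptDetected: allFalse(PROMPT_FLAGS),
    responseDetected: allFalse(RESPONSE_FLAGS),
    latencyMs: 0,
    timeout: false,
    hasError: true,
    contentErrors: [],
    error: err.message,
  };
}
```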
When to Use
- Security-critical applications
- Handling sensitive data
- Compliance requirements
- When attacks during outages are high-risk
Trade-offs
Pros:
- Attacks cannot succeed during outages
- Conservative security posture
- Predictable behavior
Cons:
- Service disruption during API issues
- User frustration with failed requests
- Requires monitoring for false blocks
Fail-Open
Behavior
When a scan fails, the plugin:
- Logs the error
- Caches no result
- Injects no warning
- Blocks no tools
- Sends the response without scanning
When to Use
- High-availability requirements
- Low-risk applications
- When API reliability is a concern
- Development/testing environments
Trade-offs
Pros:
- Service continues during outages
- Better user experience
- No false positive blocks
Cons:
- Attacks can succeed during outages
- Security gap during API issues
- Potential compliance concerns
Per-Hook Behavior
prisma-airs-audit (message_received)
// Fail-closed
if (config.failClosed) {
  cacheScanResult(sessionKey, {
    action: "block",
    categories: ["scan-failure"],
    error: err.message,
  });
}
// Fail-open: no cache entry
prisma-airs-context (before_agent_start)
// Fail-closed
if (config.failClosed) {
  return {
    prependContext: buildWarning({
      action: "block",
      categories: ["scan-failure"],
    }),
  };
}
// Fail-open: return nothing
prisma-airs-outbound (message_sending)
// Fail-closed
if (config.failClosed) {
  return {
    content: "Unable to provide response due to security verification issue.",
  };
}
// Fail-open: return nothing (send original)
prisma-airs-tools (before_tool_call)
No direct fail mode: this hook uses the cached result from the audit hook.
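A tool gate driven purely by the cached result might look like the sketch below. The `gateToolCall` function, the `ToolDecision` shape, and the tool names in `DANGEROUS_TOOLS` are all hypothetical; this guide does not list which tools the plugin treats as dangerous.

```typescript
interface CachedScanResult {
  action: string;
  categories: string[];
  error?: string;
}

interface ToolDecision {
  block: boolean;
  reason?: string;
}

// Hypothetical list; the real set of gated tools is not given in this guide.
const DANGEROUS_TOOLS = ["exec", "write_file"];

function gateToolCall(toolName: string, cached: CachedScanResult | undefined): ToolDecision {
  // After a failed scan in fail-open mode the audit hook writes no cache
  // entry, so there is nothing to act on and the call proceeds.
  if (cached === undefined || cached.action !== "block") {
    return { block: false };
  }
  // Only tools on the dangerous list are gated.
  if (DANGEROUS_TOOLS.indexOf(toolName) === -1) {
    return { block: false };
  }
  return {
    block: true,
    reason: cached.error !== undefined ? cached.error : "Blocked by cached scan result",
  };
}
```

Because the hook only reads the cache, its behavior on scan failure follows whatever the audit hook stored: a synthetic block entry in fail-closed mode, nothing in fail-open mode.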
Monitoring
Scan Failure Events
{
  "event": "prisma_airs_inbound_scan_error",
  "timestamp": "2024-01-15T10:30:00.000Z",
  "sessionKey": "session_abc123",
  "error": "API error 503: Service temporarily unavailable"
}
Block Due to Failure
{
  "event": "prisma_airs_tool_block",
  "categories": ["scan-failure"],
  "reason": "Scan failed: connection timeout"
}
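To pull these two event types out of a log stream, a filter along the following lines may be enough. It assumes the plugin writes one JSON object per line, which this guide does not confirm; `failureEventsFromLog` and `isFailureRelated` are hypothetical names.

```typescript
interface AirsLogEvent {
  event: string;
  categories?: string[];
  error?: string;
  reason?: string;
}

// True for scan errors and for tool blocks caused by a scan failure.
function isFailureRelated(e: AirsLogEvent): boolean {
  if (e.event === "prisma_airs_inbound_scan_error") {
    return true;
  }
  return e.event === "prisma_airs_tool_block"
    && e.categories !== undefined
    && e.categories.indexOf("scan-failure") !== -1;
}

// Assumes one JSON object per log line; lines that do not parse are skipped.
function failureEventsFromLog(lines: string[]): AirsLogEvent[] {
  const matches: AirsLogEvent[] = [];
  lines.forEach(function (line) {
    try {
      const parsed = JSON.parse(line) as AirsLogEvent;
      if (parsed && typeof parsed.event === "string" && isFailureRelated(parsed)) {
        matches.push(parsed);
      }
    } catch {
      // Not JSON: ignore the line.
    }
  });
  return matches;
}
```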
Hybrid Approaches
Partial Fail-Closed
Keep fail-closed enabled, but disable selected hooks so that it only takes effect where you need it:
plugins:
  prisma-airs:
    # Fail-closed for enforcement
    fail_closed: true
    # Disable certain hooks to reduce impact
    context_injection_enabled: false
    tool_gating_enabled: false
    # Keep outbound scanning
    outbound_scanning_enabled: true
This blocks outbound violations but doesn't block tool calls on scan failure.
Monitoring Mode
Log failures but don't block:
plugins:
  prisma-airs:
    fail_closed: false
    audit_enabled: true
    context_injection_enabled: false
    outbound_scanning_enabled: false
    tool_gating_enabled: false
Review logs to understand failure patterns before enabling enforcement.
Failure Scenarios
API Timeout
Cause: AIRS API slow to respond (>30s)
fail_closed: true → Block request
fail_closed: false → Allow request
Network Error
Cause: Cannot reach api.aisecurity.paloaltonetworks.com
fail_closed: true → Block request
fail_closed: false → Allow request
Invalid API Key
Cause: API key invalid or expired
Response: 401 Unauthorized
fail_closed: true → Block request
fail_closed: false → Allow request
Rate Limiting
Cause: Too many requests
Response: 429 Too Many Requests
fail_closed: true → Block request
fail_closed: false → Allow request
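All four scenarios lead to the same block-or-allow outcome, so the cause matters mainly for diagnosis and alerting. The sketch below separates the two concerns; the `ScanFailure` shape and both function names are hypothetical, and only the 401 and 429 status codes come from the scenarios above.

```typescript
type FailureKind = "timeout" | "network" | "auth" | "rate_limit" | "other";

// Hypothetical description of a failed scan attempt.
interface ScanFailure {
  httpStatus?: number;
  timedOut?: boolean;
  networkError?: boolean;
}

// Label the failure so logs and alerts can distinguish causes.
function classifyFailure(f: ScanFailure): FailureKind {
  if (f.timedOut === true) return "timeout";
  if (f.networkError === true) return "network";
  if (f.httpStatus === 401) return "auth";
  if (f.httpStatus === 429) return "rate_limit";
  return "other";
}

// The decision depends only on the fail mode, never on the failure kind.
function failureOutcome(f: ScanFailure, failClosed: boolean): { kind: FailureKind; decision: "block" | "allow" } {
  return {
    kind: classifyFailure(f),
    decision: failClosed ? "block" : "allow",
  };
}
```

Keeping the kind alongside the decision lets an operator tell an expired key (which will not recover without action) apart from a transient timeout.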
Best Practices
1. Start with Fail-Closed
Fail-closed is the default for good reason. Change it only after you understand the implications.
2. Monitor Failure Rates
Track scan failures, for example by counting prisma_airs_inbound_scan_error events.
If failures are frequent, investigate root cause before switching to fail-open.
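One way to quantify "frequent" is a rolling failure rate over the most recent scans. The class below is an illustrative sketch, not part of the plugin; `FailureRateTracker` and its window size are assumptions you would tune to your traffic.

```typescript
// Tracks the failure rate over the last `windowSize` scans.
class FailureRateTracker {
  private outcomes: boolean[] = []; // true = scan failed
  private windowSize: number;

  constructor(windowSize: number) {
    this.windowSize = windowSize;
  }

  // Record one scan outcome, dropping the oldest once the window is full.
  record(failed: boolean): void {
    this.outcomes.push(failed);
    if (this.outcomes.length > this.windowSize) {
      this.outcomes.shift();
    }
  }

  // Fraction of failed scans in the window; 0 when nothing is recorded yet.
  failureRate(): number {
    if (this.outcomes.length === 0) {
      return 0;
    }
    let failures = 0;
    this.outcomes.forEach(function (failed) {
      if (failed) failures++;
    });
    return failures / this.outcomes.length;
  }
}
```

A caller would invoke `record(true)` from the scan error path and `record(false)` after each successful scan, then read `failureRate()` on a schedule.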
3. Set Up Alerts
Alert on:
- Scan failure rate > 1%
- Consecutive failures > 5
- Error types (timeout, auth, network)
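The first two thresholds above can be expressed as a small rule check. This is a sketch under assumptions: the `FailureStats` shape and `alertsFor` are hypothetical, and how you collect the counts depends on your logging or metrics stack.

```typescript
// Hypothetical counters gathered from your logs or metrics system.
interface FailureStats {
  totalScans: number;
  failedScans: number;
  consecutiveFailures: number;
}

// Thresholds taken from the alert list above.
const MAX_FAILURE_RATE = 0.01;
const MAX_CONSECUTIVE_FAILURES = 5;

// Return the names of every alert condition that currently holds.
function alertsFor(stats: FailureStats): string[] {
  const triggered: string[] = [];
  const rate = stats.totalScans === 0 ? 0 : stats.failedScans / stats.totalScans;
  if (rate > MAX_FAILURE_RATE) {
    triggered.push("failure-rate");
  }
  if (stats.consecutiveFailures > MAX_CONSECUTIVE_FAILURES) {
    triggered.push("consecutive-failures");
  }
  return triggered;
}
```

Both comparisons are strict, so exactly five consecutive failures does not alert; the sixth does.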
4. Have a Fallback Plan
If switching to fail-open:
- Increase other security layers
- Add rate limiting
- Enable additional logging
- Consider secondary scanning service
5. Document the Decision
Record why you chose fail-open (if applicable):
- Business justification
- Risk acceptance
- Compensating controls
- Review date