Detection Services

Prisma AIRS provides several detection services that scan for different categories of risk. Each service can be independently configured with an enforcement action in airs-config.json.

Available Services

Prompt Injection

Detects attempts to manipulate the AI agent by injecting adversarial instructions into the prompt. Includes jailbreak attempts, role-play attacks, and instruction override techniques.

Applies to: Prompts

Data Loss Prevention (DLP)

Detects sensitive data in prompts and responses, including:

  • Personally identifiable information (PII)
  • API keys and credentials
  • Financial data
  • Healthcare records

Applies to: Prompts and Responses
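When DLP is configured with the mask action, matched spans are replaced before the content passes through. A minimal sketch of that behavior, assuming regex-style detectors (the patterns, function name, and mask token below are illustrative, not the service's actual rules):

```python
import re

# Illustrative patterns only; the real DLP service uses far broader detectors.
DLP_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\b(?:sk|pk)_[A-Za-z0-9]{16,}\b"),
}

def mask_sensitive(text: str, mask: str = "[MASKED]") -> str:
    """Replace any span matching a DLP pattern with a mask token."""
    for pattern in DLP_PATTERNS.values():
        text = pattern.sub(mask, text)
    return text
```

Under the mask action the surrounding text is preserved, so the prompt or response still reaches its destination with only the sensitive spans redacted.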

Toxicity

Detects harmful, offensive, or inappropriate content including hate speech, harassment, threats, and explicit material.

Applies to: Prompts and Responses

Malicious Code

Detects malicious code patterns in AI-generated responses using WildFire and Advanced Threat Prevention (ATP) engines. Catches reverse shells, credential stealers, obfuscated payloads, and known malware signatures.

Applies to: Responses (via code_response field)

Requires code extraction

Malicious code detection only triggers when code blocks are extracted from the AI response and sent in the code_response field. The code extractor handles this automatically.
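The extractor's role can be sketched as pulling fenced code blocks out of the response and attaching them to the scan payload under code_response. This is an assumption about the mechanism for illustration, not the shipped extractor:

```python
import re

# Matches fenced code blocks, with an optional language tag on the opening fence.
FENCE_RE = re.compile(r"```[\w+-]*\n(.*?)```", re.DOTALL)

def build_scan_payload(response_text: str) -> dict:
    """Assemble a scan payload; code_response is set only when code was found."""
    code_blocks = FENCE_RE.findall(response_text)
    payload = {"response": response_text}
    if code_blocks:
        payload["code_response"] = "\n".join(code_blocks)
    return payload
```

A response with no code blocks produces a payload without code_response, so the malicious code service simply does not trigger for it.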

URL Categorization

Detects suspicious or malicious URLs in AI responses. Checks URLs against Palo Alto Networks' URL filtering database.

Applies to: Responses
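Before URLs can be checked against the filtering database, candidates have to be pulled out of the response text. A naive extraction sketch (the regex and function name are illustrative; the actual database lookup is not shown):

```python
import re

# Naive URL matcher; the real service resolves each hit against
# Palo Alto Networks' URL filtering database.
URL_RE = re.compile(r"https?://[^\s)\]>\"']+")

def extract_urls(response_text: str) -> list[str]:
    """Collect candidate URLs from a response for categorization."""
    return URL_RE.findall(response_text)
```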

Custom Topics

Detects violations of custom topic policies configured in your AIRS security profile. Use this for organization-specific content policies.

Applies to: Prompts and Responses

Enforcement Configuration

{
  "enforcement": {
    "prompt_injection": "block",
    "dlp": "block",
    "malicious_code": "block",
    "url_categorization": "block",
    "toxicity": "block",
    "custom_topic": "block"
  }
}

Each service supports three actions:

Action   Behavior
block    Prevent the content from passing through
mask     Replace sensitive content and allow through
allow    Log the detection but allow through

When multiple services trigger on the same content, the strictest action wins.
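The "strictest action wins" rule can be sketched as a precedence order over the three actions, from block (strictest) down to allow (most permissive). The function name is illustrative:

```python
# Precedence from strictest to most permissive, per the action table above.
STRICTNESS = {"block": 2, "mask": 1, "allow": 0}

def resolve_action(triggered_actions: list[str]) -> str:
    """Return the strictest configured action among the services that fired."""
    return max(triggered_actions, key=STRICTNESS.__getitem__)
```

For example, if DLP is set to mask and toxicity is set to block and both fire on the same content, the content is blocked.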

Block Messages

When a prompt or response is blocked, the developer sees a formatted message:

AIRS -- Prompt Blocked

What happened: Your prompt was flagged by the Toxic Content security check.
Category: malicious
Profile: Cursor IDE - Hooks

What to do:
- Review your prompt for sensitive data, injection patterns, or policy violations.
- Modify the prompt and try again.
- If you believe this is a false positive, contact your security team
and reference Scan ID: 0d874858-bbf1-4fcd-aa0f-6f91919a9d8e