Design and Enforce an AgentIQ Policy for a Production Agent

The fundamental problem

Why system prompts are not policies

Enterprise AI teams spend a lot of time writing system prompts. "You are a helpful assistant. Do not share competitor information. Never reveal customer data. If asked to do something harmful, refuse." These instructions are reasonable. They are also not security controls.

A system prompt is part of the model's context. It influences the model's behaviour. It does not enforce it. A skilled attacker with a jailbreak, an indirect prompt injection through a retrieved document, or a multi-step manipulation sequence can override a system prompt instruction without the model flagging it as a violation. The model genuinely believes it is being helpful.

AgentIQ policies run outside the model's context. They evaluate inputs before the model sees them and outputs before they reach the caller. The model cannot override a policy by being convinced to. A prompt injection that makes the model want to call a restricted tool is stopped at the tool-call layer before the call executes.

The key distinction

System prompts operate inside the trust boundary of the model. AgentIQ policies operate outside it. An attacker who compromises the model's reasoning cannot bypass the policy engine. The policy engine does not ask the model whether a call should be allowed. It evaluates independently.

Architecture

How the AgentIQ Policy Engine works

The Policy Engine sits between your application and the model. Every interaction passes through it. It has four evaluation points:

Input evaluation

The user's message is checked before it reaches the model. PII detection, injection detection, jailbreak detection, length limits. A deny rule here stops the call immediately.

DSL resource: `message input`

Tool-call evaluation

When the model initiates a tool call, the call is evaluated before execution. Function name, argument content, and target can all be restricted by policy. A deny rule here stops the tool call.

DSL resource: `tool_call`

Tool-output evaluation

The result returned by a tool is checked before it enters the model's context. PII in tool output can be caught here before the model incorporates it into a response.

DSL resource: `tool_output`

Output evaluation

The model's response is checked before it reaches the caller. Hallucination detection, toxicity, bias, PII leakage. A failed check here blocks the response.

DSL resource: `message output`

The engine also collects telemetry on every evaluation. When telemetry is enabled in MirrorConfig, every policy decision is recorded with timestamp, input hash, rules that fired, and outcome. This is the audit trail.
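
The ordering of the four evaluation points can be sketched in plain Python. This illustrates the control flow only and is not the AgentIQ implementation: `run_turn`, `Blocked`, and the three callables standing in for the model and the tool are hypothetical names.

```python
from typing import Callable, Dict

# A check returns True when the content may pass.
# Hypothetical stand-in for the rules of a real policy.
Check = Callable[[str], bool]


class Blocked(Exception):
    """Raised with the name of the evaluation point that denied the content."""


def run_turn(
    user_message: str,
    plan_tool_call: Callable[[str], str],
    run_tool: Callable[[str], str],
    write_answer: Callable[[str, str], str],
    checks: Dict[str, Check],
) -> str:
    def gate(stage: str, content: str) -> str:
        if not checks[stage](content):
            raise Blocked(stage)
        return content

    # 1. Before the model sees the user's message
    message = gate("message input", user_message)
    # 2. Before the tool executes
    call = gate("tool_call", plan_tool_call(message))
    # 3. Before the tool result enters the model's context
    result = gate("tool_output", run_tool(call))
    # 4. Before the response reaches the caller
    return gate("message output", write_answer(message, result))
```

Because each gate wraps the next step, a denial at `tool_call` means `run_tool` is never invoked.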

Language reference

Mirror Policy DSL basics

Every policy file has the same structure. Version declaration first, then optional metadata headers, then one or more policy blocks.

basic_structure.policy

@version "1.0.0";
@author "Security Team";
@last_modified "2026-04-10";

metadata {
  description: "Policy for the enterprise internal assistant";
  security_level: HIGH;
  tags: ["production", "enterprise"];
}

policy my_policy {
  // deny rules block the interaction when condition is true
  deny message input  where check_pii() == true;
  deny message output where check_pii() == true;

  // allow rules placed before a catch-all deny form an allowlist
  allow tool_call where function.name == "safe_search";
  deny  tool_call where true;  // deny everything else

  // check statements evaluate output quality
  check_output hallucination with { threshold: 0.85 };
}

Three things matter about the DSL syntax. Operators are C-style: use &&, ||, ! not Python's and, or, not. Strings use double quotes only. Every statement ends with a semicolon. The policy will not compile if you get any of these wrong.
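
The three syntax rules are easy to check before a policy ever reaches the compiler. The following is a minimal sketch of such a pre-check, assuming it is given one statement at a time (not block headers such as `policy my_policy {`). `lint_statement` is a hypothetical helper, not part of the Mirror SDK, and it does not replace the real compiler.

```python
import re
from typing import List

DOUBLE_QUOTED = re.compile(r'"[^"]*"')
PYTHON_OPERATOR = re.compile(r"\b(and|or|not)\b")


def lint_statement(statement: str) -> List[str]:
    """Return the syntax problems found in a single DSL statement."""
    # Blank out double-quoted strings first so their contents are not inspected,
    # then drop any trailing // comment.
    code = DOUBLE_QUOTED.sub('""', statement).split("//", 1)[0].strip()
    problems = []
    if "'" in code:
        problems.append("use double quotes for strings")
    if PYTHON_OPERATOR.search(code):
        problems.append("use &&, ||, ! instead of and, or, not")
    if code and not code.endswith(";"):
        problems.append("statement must end with a semicolon")
    return problems


good = lint_statement('deny message input where check_pii() == true;')
bad = lint_statement("deny tool_call where function.name == 'x' and true")
```

Here `good` is an empty list and `bad` contains all three problems.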

Resource	Evaluates	Use for
`message input`	User's message before the model sees it	PII detection, injection detection, jailbreak, length limits
`message output`	Model's response before it reaches the caller	PII leakage, hallucination, toxicity, bias
`tool_call`	Function call initiated by the model	Restrict file operations, block dangerous SQL, SSRF prevention
`tool_output`	Result returned by a tool	PII in tool results, content from untrusted sources
`rag`	Retrieved context before injection	Source verification, context manipulation, data poisoning
`embedding`	Embedding vectors	Tamper detection, semantic drift

Layer 1

Input protection layer

Input protection runs before the model. It is the cheapest layer to enforce and the one that stops the most attacks. Every production agent needs at minimum: PII detection, prompt injection detection, jailbreak detection, and length limits.

input_protection.policy

@version "1.0.0";

policy input_protection {

  // Reject empty inputs
  deny message input where length(content) == 0;

  // Reject inputs over 10,000 characters
  // Prevents token flooding and context stuffing attacks
  deny message input where length(content) > 10000;

  // Block PII: SSNs, credit cards, phone numbers in input
  // Prevents accidental data ingestion into AI context
  deny message input where check_pii() == true;

  // Block prompt injection attempts
  deny message input where check_prompt_injection() == true;

  // Block jailbreak attempts
  deny message input where detect_jailbreak() == true;

  // Higher confidence injection blocking for production
  check_prompt injection with { threshold: 0.9, enabled: true };

}
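
To make the order of the cheap checks concrete, here is a rough Python equivalent of the length and PII rules. It is a sketch under stated assumptions: the single SSN regex is a stand-in for illustration and is far narrower than a real PII detector, and `evaluate_input` is a hypothetical name.

```python
import re

MAX_INPUT_CHARS = 10_000
# Illustrative pattern for US Social Security numbers written as 123-45-6789.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def evaluate_input(content: str) -> str:
    """Return 'allow' or a deny reason, checking the cheapest rules first."""
    if len(content) == 0:
        return "deny: empty input"
    if len(content) > MAX_INPUT_CHARS:
        return "deny: input too long"
    if SSN_PATTERN.search(content):
        return "deny: possible SSN"
    return "allow"
```

The length checks run first because they cost almost nothing and keep oversized inputs away from the more expensive detectors.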

Threshold tuning matters

The default prompt injection threshold is 0.5. At that level you will see false positives on legitimate complex queries. For a general enterprise assistant, 0.7 to 0.8 is a reasonable starting point. If your agent handles a specific domain with predictable inputs, you can push to 0.9. Start at the lower end of that range and raise the threshold only when telemetry shows legitimate queries being blocked.
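
One way to reason about a threshold is to replay labelled detector scores against it. The sketch below assumes an input is blocked when its score is greater than or equal to the threshold. The scores are invented for illustration and are not measurements, and `block_rates` is a hypothetical helper.

```python
from typing import List, Tuple


def block_rates(samples: List[Tuple[float, bool]], threshold: float) -> Tuple[float, float]:
    """samples holds (detector_score, is_attack) pairs.

    Returns (false_positive_rate, miss_rate) at the given threshold,
    assuming an input is blocked when score >= threshold.
    """
    benign = [score for score, is_attack in samples if not is_attack]
    attacks = [score for score, is_attack in samples if is_attack]
    false_positive_rate = sum(score >= threshold for score in benign) / len(benign)
    miss_rate = sum(score < threshold for score in attacks) / len(attacks)
    return false_positive_rate, miss_rate


# Made-up scores: four legitimate queries, four attacks.
samples = [
    (0.55, False), (0.62, False), (0.30, False), (0.10, False),
    (0.95, True), (0.88, True), (0.72, True), (0.91, True),
]

for threshold in (0.5, 0.8, 0.9):
    print(threshold, block_rates(samples, threshold))
```

With this toy data, 0.5 blocks half of the legitimate queries, while 0.9 blocks none of them but lets half of the attacks through. That trade-off is what tuning against telemetry is meant to surface.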

Layer 2

Output protection layer

Output protection runs after the model generates a response but before it reaches the caller. This is where you catch PII the model might have included from retrieved context, hallucinated facts, and toxic content.

output_protection.policy

@version "1.0.0";

policy output_protection {

  // Block PII in output
  // Customer data from retrieved context must not appear in responses
  deny message output where check_pii() == true;

  // Detect hallucination
  // 0.85 threshold: blocks responses with significant ungrounded claims
  check_output hallucination with { threshold: 0.85 };

  // Verify factual consistency with retrieved context
  check_output factual_consistency;

  // Block toxic and biased content in customer-facing responses
  check_output toxicity;
  check_output bias;

  // Block code injection patterns in output
  // Prevents the model from generating malicious scripts
  check_output code_injection;

  // Monitor for prompt reflection
  // Catches cases where injected instructions appear in output
  check_output prompt_reflection;

}

Layer 3

Tool-call security policies

Tool-call policies are the most powerful layer for agentic systems. They intercept the model's function calls before execution. This is where you enforce least privilege on the agent: it can only call what you explicitly allow, and only within the parameters you define.

File operation controls

file_security.policy

@version "1.0.0";

policy file_security {

  // Block access to system files and credentials
  deny tool_call where
    function.name == "read_file" &&
    starts_with(function.arguments, "/etc/");

  deny tool_call where
    function.name == "read_file" &&
    contains(function.arguments, ".ssh/");

  deny tool_call where
    function.name == "read_file" &&
    contains(function.arguments, ".env");

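  // Block path traversal so "/workspace/data/../../etc/passwd"
  // cannot pass the prefix check below
  deny tool_call where
    function.name == "read_file" &&
    contains(function.arguments, "..");

  // Allow reads from the designated safe directory only
  allow tool_call where
    function.name == "read_file" &&
    starts_with(function.arguments, "/workspace/data/");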

  // Deny any read_file call not in the allowlist above
  deny tool_call where function.name == "read_file";

  // Block all write operations outright
  deny tool_call where function.name == "write_file";
  deny tool_call where function.name == "delete_file";

}

Database and network controls

db_network_security.policy

@version "1.0.0";

policy db_network_security {

  // Block SQL injection patterns
  deny tool_call where
    function.name == "execute_sql" &&
    (icontains(function.arguments, "OR 1=1") ||
     icontains(function.arguments, "UNION SELECT") ||
     icontains(function.arguments, "DROP TABLE") ||
     icontains(function.arguments, "DELETE FROM") ||
     icontains(function.arguments, "--"));

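  // Block SSRF: prevent access to internal services
  // Substring checks are coarse: "10.0." and "172.16." cover only part of
  // the 10.0.0.0/8 and 172.16.0.0/12 private ranges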
  deny tool_call where
    function.name == "http_request" &&
    contains(function.arguments, "localhost");

  deny tool_call where
    function.name == "http_request" &&
    contains(function.arguments, "127.0.0.1");

  deny tool_call where
    function.name == "http_request" &&
    (contains(function.arguments, "192.168.") ||
     contains(function.arguments, "10.0.")    ||
     contains(function.arguments, "172.16."));

  // Allow HTTP only to external HTTPS endpoints
  allow tool_call where
    function.name == "http_request" &&
    starts_with(function.arguments, "https://");

  deny tool_call where function.name == "http_request";

  // Check tool output for PII from external sources
  deny tool_output where detect_pii(tool_output.content) == true;

}
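
Substring matching on the argument string is a first line of defence. If the tool itself is under your control, a stricter check on the parsed URL is a useful complement. The sketch below uses only the Python standard library. `is_internal_target` is a hypothetical helper, and it deliberately does not resolve DNS names, so a hostname that resolves to an internal address would need a separate check.

```python
import ipaddress
from urllib.parse import urlsplit


def is_internal_target(url: str) -> bool:
    """Return True when the URL points at a loopback, private, or link-local host."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # no host could be parsed: treat as unsafe
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # A DNS name rather than an IP literal; resolution is out of scope here.
        return False
    return address.is_private or address.is_loopback or address.is_link_local
```

Parsing the host covers the whole of each private range, including addresses such as 10.1.2.3 and 172.20.0.5 that a `"10.0."` or `"172.16."` substring test does not match.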

The default should be deny

In the file_security policy above, reads that do not match the allowlist are explicitly denied. This is the correct pattern. An agent that can call read_file on any path it reasons is appropriate is an agent that a prompt injection attack can direct to read your secrets. Always enumerate what is allowed and deny everything else.
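
The same allowlist-then-deny pattern can be sketched in Python. One assumption goes beyond the policy text: the path is normalised before the prefix check, so `..` segments cannot escape the safe directory. `read_file_decision` is a hypothetical name, and `posixpath.normpath` does not resolve symlinks.

```python
import posixpath

ALLOWED_ROOT = "/workspace/data/"


def read_file_decision(path: str) -> str:
    """Allow reads under ALLOWED_ROOT only; everything else is denied by default."""
    # Collapse "." and ".." segments before comparing against the allowed prefix.
    normalized = posixpath.normpath(path)
    if normalized.startswith(ALLOWED_ROOT):
        return "allow"
    return "deny"


print(read_file_decision("/workspace/data/report.csv"))
print(read_file_decision("/workspace/data/../../etc/passwd"))
print(read_file_decision("/home/app/.env"))
```

There is no list of forbidden paths here at all. Anything that is not provably inside the allowed root falls through to the final `deny`.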

Layer 4

RAG source verification

If your agent uses retrieval, the retrieved content is an attack surface. Context manipulation attacks embed adversarial instructions in documents that get retrieved into the model's context. Data poisoning plants malicious content in your vector store. These policies run on the RAG context before it is injected.

rag_security.policy

@version "1.0.0";

policy rag_security {

  // Verify retrieved documents are from trusted sources
  check_rag source_verification;

  // Detect if context has been manipulated to contain instructions
  check_rag context_manipulation;

  // Detect data poisoning in the retrieval pipeline
  check_rag data_poisoning;

  // Validate retrieval quality and integrity
  validate_retrieval {
    source_authenticity;
    content_relevance;
    chunk_integrity;
  };

}

policy embedding_security {

  // Detect embedding tampering and poisoning attacks
  check_rag embedding_attack;

  validate_retrieval {
    embedding_similarity;
    vector_consistency;
    semantic_drift;
  };

}

Composition

Production security chain

In production you do not apply policies one at a time. You compose them into a chain. A chain groups related policies and applies them in order. If any policy in the chain denies, the call is blocked.

enterprise_agent.policy

@version "1.0.0";
@author "Security Team";
@last_modified "2026-04-10";

metadata {
  description: "Production security chain for the enterprise internal AI assistant";
  security_level: CRITICAL;
  tags: ["production", "enterprise", "internal-assistant"];
}

chain enterprise_agent_security {

  policy input_layer {
    deny message input where length(content) == 0;
    deny message input where length(content) > 10000;
    deny message input where check_pii() == true;
    deny message input where check_prompt_injection() == true;
    deny message input where detect_jailbreak() == true;
  }

  policy tool_layer {
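    // File access: safe directory only, with path traversal blocked first
    deny tool_call where
      function.name == "read_file" &&
      contains(function.arguments, "..");
    allow tool_call where
      function.name == "read_file" &&
      starts_with(function.arguments, "/workspace/data/");
    deny tool_call where function.name == "read_file";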

    // SQL: block destructive patterns
    deny tool_call where
      function.name == "execute_sql" &&
      (icontains(function.arguments, "DROP") ||
       icontains(function.arguments, "DELETE FROM") ||
       icontains(function.arguments, "UNION SELECT"));

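    // HTTP: external HTTPS only
    // Internal hosts are denied before the allow rule, otherwise
    // "https://localhost/..." would match the https:// allowlist
    deny tool_call where
      function.name == "http_request" &&
      (contains(function.arguments, "localhost") ||
       contains(function.arguments, "127.0.0.1") ||
       contains(function.arguments, "192.168.")  ||
       contains(function.arguments, "10.0.")     ||
       contains(function.arguments, "172.16."));
    allow tool_call where
      function.name == "http_request" &&
      starts_with(function.arguments, "https://");
    deny tool_call where function.name == "http_request";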

    // Tool output: no PII from external sources
    deny tool_output where detect_pii(tool_output.content) == true;
  }

  policy rag_layer {
    check_rag source_verification;
    check_rag context_manipulation;
    check_rag data_poisoning;
  }

  policy output_layer {
    deny message output where check_pii() == true;
    check_output hallucination with { threshold: 0.85 };
    check_output toxicity;
    check_output bias;
    check_output code_injection;
  }

}
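
The chain semantics described above (policies applied in order, any denial blocks the call) can be sketched as follows. This is an illustration, not the engine's implementation. `run_chain` and the two toy policies are hypothetical, and each policy is modelled as a function that returns a deny reason or `None`.

```python
from typing import Callable, List, Optional, Tuple

# A policy returns a deny reason, or None when it has no objection.
Policy = Callable[[str], Optional[str]]


def run_chain(chain: List[Tuple[str, Policy]], content: str) -> Tuple[bool, Optional[str]]:
    """Apply policies in order and stop at the first denial."""
    for name, policy in chain:
        reason = policy(content)
        if reason is not None:
            return False, f"{name}: {reason}"
    return True, None


def length_policy(content: str) -> Optional[str]:
    return "input too long" if len(content) > 10000 else None


def keyword_policy(content: str) -> Optional[str]:
    return "blocked phrase" if "drop table" in content.lower() else None


chain = [("input_layer", length_policy), ("tool_layer", keyword_policy)]
print(run_chain(chain, "hello"))
print(run_chain(chain, "DROP TABLE users"))
```

The returned reason names the layer that denied, which is the same information you would want in an audit record.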

Deployment option 1

Deploy with the policy_monitor decorator

The decorator is the fastest path to enforcement. One import, one decorator above your function, done. The named policy is evaluated on every call.

agent.py

import asyncio

from mirror_sdk.ops.mirror_decorators import policy_monitor
from mirror_sdk.core.mirror_core import MirrorConfig
from openai import AsyncOpenAI

config = MirrorConfig.from_env()

# Async client, so the model call does not block the event loop.
# Reads OPENAI_API_KEY from the environment.
client = AsyncOpenAI()

# The policy named here must exist in AgentIQ
# Upload enterprise_agent.policy via PolicyAPIService first
@policy_monitor(name="enterprise_agent_security", mirror_config=config)
async def run_agent(user_query: str) -> str:
    """
    This function is wrapped by the policy engine.
    The policy evaluates user_query before this code runs.
    The return value is checked by the output layer before it reaches the caller.
    If any policy rule fires, this function is not called (input deny)
    or the return value is blocked (output deny).
    """
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": "You are an internal enterprise assistant."},
            {"role": "user",   "content": user_query}
        ]
    )
    # content is None when the model returns no text
    return response.choices[0].message.content or ""


# Usage
async def main():
    result = await run_agent("Summarise last quarter's performance metrics")
    print(result)

asyncio.run(main())

What happens when a policy fires

If an input deny rule fires, run_agent is never called. The caller receives a policy violation response. If an output deny rule fires, the return value from run_agent is blocked before it reaches the caller. In both cases, the event is recorded in the AgentIQ telemetry log with the rule that fired.
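
The two blocking paths can be illustrated with a plain Python decorator. This is a sketch of the behaviour described above, not the implementation of `policy_monitor`. `guarded` and `PolicyViolation` are hypothetical names, and raising an exception is an assumption, since the SDK may deliver the policy violation response in another form.

```python
import asyncio
import functools
from typing import Awaitable, Callable


class PolicyViolation(Exception):
    """Hypothetical error type; the real SDK may signal a violation differently."""


def guarded(check_input: Callable[[str], bool], check_output: Callable[[str], bool]):
    def decorator(func: Callable[[str], Awaitable[str]]):
        @functools.wraps(func)
        async def wrapper(text: str) -> str:
            if not check_input(text):
                # Input deny: the wrapped function is never awaited.
                raise PolicyViolation("input blocked")
            result = await func(text)
            if not check_output(result):
                # Output deny: the function ran, but its result is discarded.
                raise PolicyViolation("output blocked")
            return result
        return wrapper
    return decorator


calls = []


@guarded(
    check_input=lambda text: "ssn" not in text.lower(),
    check_output=lambda result: "secret" not in result.lower(),
)
async def echo_agent(text: str) -> str:
    calls.append(text)
    return f"echo: {text}"
```

Note the asymmetry: an input denial prevents any work from happening, while an output denial only prevents the result from leaving. Side effects inside the function still need tool-call rules.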

Deployment option 2

Deploy with PolicyAPIService

The programmatic API is for teams that need to manage policies dynamically: updating rules without code changes, building policy management tooling, or pushing emergency updates to a deployed agent.

policy_management.py

import asyncio
from mirror_sdk.ops.mirror_agentiq_policy_api import PolicyAPIService, PolicyCreate
from mirror_sdk.core.mirror_core import MirrorConfig

config = MirrorConfig.from_env()
policy_service = PolicyAPIService(config)

async def deploy_production_policy():
    # Read the policy DSL file
    with open("enterprise_agent.policy") as f:
        policy_text = f.read()

    # Create and save the policy
    new_policy = PolicyCreate(
        policy_name="enterprise_agent_security",
        policy_text=policy_text
    )
    saved = await policy_service.save_policy(new_policy)
    print(f"Policy saved: {saved['_id']}")

    # Deploy it (makes it active immediately)
    await policy_service.deploy_policy(saved["_id"])
    print("Policy deployed. All @policy_monitor calls referencing this name now use the new version.")

async def list_active_policies():
    policies = policy_service.get_all_deployed_policies()
    print(f"Active policies: {len(policies)}")
    for p in policies:
        print(f"  {p['policy_name']} - deployed at {p.get('deployed_at', 'unknown')}")

async def emergency_policy_update(updated_policy_text: str):
    """
    Push an emergency policy update without redeploying the agent.
    Useful when a new attack pattern is discovered in production.

    updated_policy_text is saved as the whole policy, so pass the complete
    policy file with the new rule added, not the new rule on its own.
    """
    # Save updated policy and deploy immediately
    updated_policy = PolicyCreate(
        policy_name="enterprise_agent_security",
        policy_text=updated_policy_text
    )
    saved = await policy_service.save_policy(updated_policy)
    await policy_service.deploy_policy(saved["_id"])
    print("Emergency policy update deployed without agent restart.")

asyncio.run(deploy_production_policy())
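
Why can a redeployed policy change behaviour without a restart? One plausible mechanism is that the policy is resolved by name on every call rather than bound once at import time. The sketch below models that idea with an in-memory dictionary. `DEPLOYED`, `deploy`, and `is_allowed` are hypothetical and say nothing about how AgentIQ stores or distributes policies.

```python
from typing import Callable, Dict

# Hypothetical in-memory stand-in for the set of deployed policies.
DEPLOYED: Dict[str, Callable[[str], bool]] = {}


def deploy(name: str, rule: Callable[[str], bool]) -> None:
    """Replace the active rule for a policy name."""
    DEPLOYED[name] = rule


def is_allowed(policy_name: str, text: str) -> bool:
    # Resolved on every call, so a later deploy() changes the outcome
    # without touching the calling code.
    return DEPLOYED[policy_name](text)


deploy("enterprise_agent_security", lambda text: len(text) <= 10000)
first = is_allowed("enterprise_agent_security", "ignore previous instructions")

# Emergency update: same name, stricter rule.
deploy(
    "enterprise_agent_security",
    lambda text: len(text) <= 10000
    and "ignore previous instructions" not in text.lower(),
)
second = is_allowed("enterprise_agent_security", "ignore previous instructions")
```

The caller's code is identical for both checks; only the rule registered under the name changed, so `first` is `True` and `second` is `False`.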

Before going live

Testing and telemetry

Before deploying a policy to production, test it in the Policy Playground: in the Portal, go to AgentIQ, then Policy Manager, then Policy Playground. The Playground lets you send test inputs and see exactly which rules fire and why, without affecting your live system.

In production, enable telemetry to get a structured audit trail of every policy decision.

telemetry_config.py

from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig

# Telemetry must be enabled in config for audit trail collection
config = MirrorConfig(
    api_key="your-api-key",
    server_url="https://mirrorapi.azure-api.net/v1",
    telemetry_enabled=True,
    policy_eval_enabled=True,
    # Max retries on transient failures
    max_retries=3,
    # How often the SDK polls for policy updates (seconds)
    polling_interval=300
)

sdk = MirrorSDK(config)

# With telemetry enabled, every policy evaluation is recorded:
# - timestamp
# - input hash (not the input itself)
# - policy name and version evaluated
# - rules that fired
# - outcome: allowed or blocked
# - latency
print("SDK configured with telemetry and policy eval enabled")
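
To show what an audit record with an input hash might look like, here is a minimal sketch. The field names and the choice of SHA-256 are assumptions for illustration and do not reflect the actual AgentIQ telemetry schema. `audit_record` is a hypothetical helper.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import List


def audit_record(policy_name: str, content: str, rules_fired: List[str], latency_ms: float) -> dict:
    """Build one audit entry that stores a hash of the input, never the input itself."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "input_hash": hashlib.sha256(content.encode("utf-8")).hexdigest(),
        "policy": policy_name,
        "rules_fired": rules_fired,
        "outcome": "blocked" if rules_fired else "allowed",
        "latency_ms": latency_ms,
    }


record = audit_record(
    "enterprise_agent_security",
    "my number is 123-45-6789",
    ["input_layer: check_pii"],
    4.2,
)
print(json.dumps(record, indent=2))
```

Hashing lets an auditor confirm that a specific input was evaluated, by hashing it again and comparing, without the log becoming a second copy of sensitive data.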

Testing stage	Tool	Purpose
Policy authoring	Policy Workbench (natural language to DSL)	Generate policy draft from plain English; faster than writing DSL manually
Policy validation	Policy Playground	Send test inputs, verify rules fire correctly, catch syntax errors
Pre-production	DiscoveR quickScan on staging	Adversarial test of the full agent including policy enforcement
Production monitoring	AgentIQ telemetry	Audit trail of every policy decision; surface false positives for threshold tuning
Ongoing	DiscoveR quarterly scan	Verify policies still hold under evolving attack patterns

Common questions

FAQ

What is the Mirror Policy DSL and how does it differ from writing rules in Python?

The Mirror Policy DSL is a declarative language designed specifically for expressing AI security policies. Unlike Python code that runs inside your application, DSL policies are evaluated by the AgentIQ Policy Engine at runtime, outside your application logic. This means they cannot be bypassed by a jailbreak that compromises the model, they are auditable by security teams who do not write Python, and they can be updated and redeployed without restarting your application.

What is the difference between the policy_monitor decorator and the PolicyAPIService?

The policy_monitor decorator is the simplest integration path. You add one decorator above your agent function and the named policy is evaluated on every call. The PolicyAPIService is the programmatic interface for creating, saving, deploying, listing, and removing policies at runtime. Use the decorator for straightforward enforcement. Use PolicyAPIService when you need to manage policies dynamically, update them without code changes, or build policy management tooling.

Can AgentIQ policies be updated without restarting a production agent?

Yes. Policies saved and deployed via PolicyAPIService take effect without restarting the application. The policy_monitor decorator evaluates the policy by name at call time, so deploying a new version of a named policy changes the behaviour of all decorated functions that reference that name. This allows you to respond to a discovered vulnerability by pushing a policy update rather than redeploying the entire agent.

How do tool-call policies work in AgentIQ?

Tool-call policies intercept function calls that the AI model initiates before they are executed. You can deny specific tool calls by function name, by argument content, or by combinations of both. For example, you can allow the read_file tool only when the path starts with a safe directory, or block the execute_sql tool when the argument contains DROP or DELETE patterns. Tool-call policies apply regardless of whether the tool call was in the original plan or triggered by a prompt injection attack.

What triggers the policy_monitor decorator to block a call?

Any deny rule in the active policy that evaluates to true blocks the call. When an input deny rule fires, the function decorated with policy_monitor is not executed, and the caller receives a policy violation response indicating the call was blocked. If the policy includes output deny rules or check_output statements, those are evaluated against the function's return value before it is passed back to the caller. A failed output check also blocks the response.

How does AgentIQ policy telemetry support compliance audits?

When telemetry is enabled in MirrorConfig, AgentIQ records every policy evaluation, including a hash of the input that was evaluated (not the input itself), the rules that fired, whether the call was allowed or blocked, and a timestamp. This creates a structured audit trail of every policy decision. For regulated industries, this telemetry provides evidence that policies were enforced consistently, which can support audit requirements for AI system governance under frameworks like EU AI Act Article 9 and NIST AI RMF.

Next: Sovereign AI deployment with FIPS 140-3 and air-gapped FHE

Module H2 covers deployments where data must never leave the organisation's physical control. Air-gapped FHE, FIPS 140-3 key management, and the architecture for running VectaX in environments with no external connectivity.
