Runtime Safety, Compliance, and Guardrails for AI Applications
AgentIQ is Mirror Security's AI safety platform. It sits between your users and your AI models, inspecting every interaction in real time and stopping threats before they produce harm. When a threat arrives, AgentIQ blocks it before the model responds. When a bad output is generated, it blocks it before the user sees it. It acts at the point of interaction, not after.
Most AI safety tools are bolt-on filters that inspect text after the model has already
responded. AgentIQ intercepts interactions at the model boundary using a policy engine
and detection APIs. Threats are caught before the model generates a response.
Bad outputs are caught before they reach the user. It covers the full surface of AI risk:
malicious inputs, harmful outputs, privacy violations, factual inaccuracy, and agent tool abuse.
Architecture
How AgentIQ Works
AgentIQ intercepts at the model boundary in both directions. Every user message passes
through input checks before the LLM sees it. Every model response passes through output
checks before the user receives it. Both layers evaluate in real time against your deployed policies.
Seven detection capabilities, exposed through clean Python APIs. Use them individually or through the Unified Safety API for a single-call evaluation.
🔒 PII Detection and Redaction
sdk.agentiq.detect_pii()
Detects personally identifiable information in text: email addresses, phone numbers, SSNs, credit card numbers, and names. Returns detected entities, a redacted version, and a risk score. Configurable action: ALERT, REDACT, BLOCK, SANITIZE, or ALLOW.
Returns: entities list, redacted_text, risk_score
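The service's response schema is not documented here, but the described return shape (entities list, redacted text, risk score) can be mimicked with a self-contained stand-in. The regexes and the scoring formula below are illustrative assumptions, not the product's detection logic:

```python
import re

# Illustrative patterns only -- the real service uses far richer detection.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def detect_pii_sketch(text: str) -> dict:
    """Mimic the documented return shape: entities, redacted_text, risk_score."""
    entities = []
    redacted = text
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            entities.append({"type": label, "value": match.group()})
        redacted = pattern.sub(f"[{label.upper()}]", redacted)
    # Naive risk score: more distinct entity types -> higher risk, capped at 1.0.
    risk_score = min(1.0, 0.4 * len({e["type"] for e in entities}))
    return {"entities": entities, "redacted_text": redacted, "risk_score": risk_score}

result = detect_pii_sketch("Reach me at jane@example.com or 555-867-5309.")
```

A configured action (REDACT, BLOCK, and so on) would then branch on these fields, for example blocking when `risk_score` exceeds a threshold and otherwise forwarding `redacted_text`.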
⚠️ Prompt Injection Detection
sdk.agentiq.detect_prompt_injection()
Detects attempts to override system instructions, extract system prompts, bypass safety measures, or manipulate model behaviour. Covers direct injection and indirect injection via retrieved RAG context. Returns a detection flag, injection-specific score, and overall confidence.
Returns: detected, prompt_injection, score
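The real detector is model-based; as a runnable illustration of the return shape (detected flag, injection-specific score, overall confidence), here is a toy keyword heuristic. The marker list and scoring are assumptions for demonstration only:

```python
import re

# Toy heuristic phrases; the actual detector is not a keyword list.
INJECTION_MARKERS = [
    r"ignore (all )?(previous|prior) instructions",
    r"reveal (your )?system prompt",
    r"you are now",
    r"disregard .* rules",
]

def detect_prompt_injection_sketch(text: str) -> dict:
    """Mimic the documented return shape: detected, prompt_injection, score."""
    hits = [m for m in INJECTION_MARKERS if re.search(m, text, re.IGNORECASE)]
    score = min(1.0, 0.5 * len(hits))  # assumed scoring, illustration only
    return {"detected": bool(hits), "prompt_injection": score, "score": score}
```

The same shape applies to indirect injection: in a RAG pipeline you would run the check over retrieved context chunks as well as the user message.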
🛡️ Toxicity and Bias Detection
sdk.agentiq.detect_bias()
Analyses content for toxicity and bias across multiple dimensions simultaneously. Returns separate result objects for toxicity and bias, each with a detected flag and confidence score. Evaluated through the moderation service for contextual accuracy.
Returns: list of toxicity and bias result objects
🔍 Hallucination Detection
sdk.agentiq.analyze_hallucination()
Evaluates AI responses for factual accuracy using pair-based analysis. Compares model output against provided context. Configurable detection threshold (default 0.5). Returns pairs of analysis results, each with a final score and hallucination classification.
Returns: pairs list with is_hallucination, final_score
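The service's scoring model is not documented here; as a self-contained illustration of the pair-based shape, where each pair carries a final score and a hallucination classification against the 0.5 default threshold, here is a naive token-overlap stand-in:

```python
def analyze_hallucination_sketch(context: str, output: str, threshold: float = 0.5) -> dict:
    """Score each output sentence against the context by token overlap.
    Stand-in for the service's pair-based analysis; the 0.5 default
    threshold mirrors the documentation, the overlap metric does not."""
    context_tokens = set(context.lower().split())
    pairs = []
    for sentence in filter(None, (s.strip() for s in output.split("."))):
        tokens = set(sentence.lower().split())
        overlap = len(tokens & context_tokens) / len(tokens)
        pairs.append({
            "sentence": sentence,
            "final_score": round(overlap, 3),
            "is_hallucination": overlap < threshold,
        })
    return {"pairs": pairs}

report = analyze_hallucination_sketch(
    context="The Eiffel Tower is in Paris",
    output="The Eiffel Tower is in Paris. It was built on Mars.",
)
```

A grounded sentence scores high against the context; the unsupported one falls below the threshold and is flagged.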
📊 RAG Quality: Context Quality
sdk.agentiq.analyze_context_quality()
Evaluates RAG system quality without requiring ground truth. Takes question, context, and LLM response. Returns quality score, relevance score, and accuracy score. Best for real-time production monitoring and A/B testing of RAG configurations.
Returns: metrics list (quality, relevance, accuracy)
✅ RAG Quality: Ground Truth
sdk.agentiq.analyze_ground_truth()
Validates factual accuracy when a reference answer is available. Returns faithfulness, answer correctness, context precision, context recall, entity recall, and answer similarity. Best for model evaluation, benchmarking, and training data quality assessment.
Returns: faithfulness, correctness, precision, recall
Unified Safety API
Run all safety checks in a single call. Results are keyed by check name so you can map each result back to its detection service. Checks auto-enable based on which inputs you provide. Run in serial (default, deterministic ordering) or parallel (concurrent, faster).
Check | Required Inputs | Returns
prompt_injection | text or conversation | detected, score, prompt_injection
toxicity | text or conversation | detected, score
bias | text or conversation | detected, score
pii | text or conversation | entities, redacted_text, risk_score
context_quality | question, context, llm_response | metrics (quality, relevance, accuracy)
ground_truth | question, context, llm_response, ground_truth | faithfulness, correctness, precision, recall
hallucination | input_text, output_text | pairs with is_hallucination
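The auto-enable and serial-versus-parallel behaviour described above can be sketched with a toy dispatcher. The check registry and check bodies below are stand-ins, not the real detection services; only the pattern, enabling checks by provided inputs and keying results by check name, reflects the documentation:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy check registry: name -> (required inputs, check function).
CHECKS = {
    "pii": ({"text"}, lambda kw: {"detected": "@" in kw["text"]}),
    "toxicity": ({"text"}, lambda kw: {"detected": "idiot" in kw["text"].lower()}),
    "context_quality": ({"question", "context", "llm_response"},
                        lambda kw: {"metrics": {"quality": 0.9}}),
}

def evaluate_sketch(parallel: bool = False, **inputs) -> dict:
    """Run every check whose required inputs were provided; key results by name."""
    enabled = {name: fn for name, (req, fn) in CHECKS.items() if req <= inputs.keys()}
    if parallel:
        # Concurrent evaluation; results still collected under stable keys.
        with ThreadPoolExecutor() as pool:
            futures = {name: pool.submit(fn, inputs) for name, fn in enabled.items()}
            return {name: f.result() for name, f in futures.items()}
    # Serial evaluation: deterministic, one check at a time.
    return {name: fn(inputs) for name, fn in enabled.items()}
```

Passing only `text` enables the text checks; passing the RAG triple enables `context_quality` instead, without any explicit configuration.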
Policy Engine
AgentIQ Policy Engine
The Policy Engine defines what gets blocked, what gets allowed, and what gets flagged.
Policies are enforced in real time at the model boundary.
Policies evaluate against named resources: message input, message output,
tool_call, tool_output, rag, and embedding.
Rules are declarative: deny rules block when a condition is true; allow rules permit specific traffic once every deny rule has passed;
check statements invoke the detection APIs for deeper semantic evaluation.
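The deny-then-allow ordering can be illustrated with a toy evaluator. The rule structure, field names, and default verdict below are assumptions for demonstration; they are not the actual policy DSL:

```python
def evaluate_rules_sketch(rules: list, request: dict) -> str:
    """Deny rules evaluate first; allow rules only permit traffic that
    survives every deny rule. Unmatched traffic is flagged for review."""
    for rule in (r for r in rules if r["effect"] == "deny"):
        if rule["condition"](request):
            return "blocked"
    for rule in (r for r in rules if r["effect"] == "allow"):
        if rule["condition"](request):
            return "allowed"
    return "flagged"

# Hypothetical rules over a message_input resource.
rules = [
    {"effect": "deny", "condition": lambda r: "ssn" in r["text"].lower()},
    {"effect": "allow", "condition": lambda r: r["resource"] == "message_input"},
]
```

Note the ordering guarantee: a request matching both a deny and an allow rule is still blocked, because deny rules are exhausted first.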
Three Ways to Create Policies
01
Natural Language Workbench
Portal → AgentIQ → Policy Manager → Policy Workbench.
Describe what you want to protect in plain English. The engine converts
it to a compilable DSL policy. Test it. Deploy it. No DSL knowledge required.
02
12 Pre-built Policies
Copy-paste ready. The policies cover PII blocking, injection prevention,
toxicity filtering, hallucination detection, SQL injection, file security,
RAG pipeline protection, and full production security chains.
03
Custom DSL
Full grammar control. Write conditional policies, policy chains, tool call controls,
threshold-based checks, and lambda expressions over collections.
See the Grammar Reference for complete syntax documentation.
12 Ready-Made Policies
Policy | Protects Against | Deploy When
block_pii_input | PII in user messages: SSN, credit card, phone | Any app handling user input
block_pii_output | PII leaked in model responses | Healthcare, finance, customer-facing apps
prevent_injection | Prompt injection and jailbreak attempts | All production deployments
input_limits | Empty inputs and oversized inputs (over 10,000 characters) | APIs with usage quotas
content_safety | Toxic and biased model output | Public chatbots, content generation
hallucination_detection | Factually incorrect model output | RAG systems, knowledge Q&A
file_security | Agents reading sensitive system files (/etc/, .env) | Coding assistants, DevOps agents
sql_security | SQL injection patterns in agent-generated queries | Database agents, data analysis
network_security | SSRF via agent HTTP requests to internal addresses | Web scraping agents, API integrations
rag_security | Tampered sources, context manipulation, poisoning | Enterprise RAG, document Q&A
embedding_security | Embedding poisoning in vector database pipelines | Semantic search, RAG with external sources
production_security | Full chain: input, output, and tool protection | Any production AI deployment
Output and Behaviour Check Types
check_output hallucination: Model output against context for factual grounding. Configurable threshold.
check_output toxicity: Harmful or offensive language in model response.
check_output bias: Biased language across multiple demographic dimensions.
check_output pii: PII in model response before delivery to user.
check_output factual_consistency: Internal consistency of the model's response.
check_output code_injection: Injected executable code patterns in model output.
check_prompt injection: Injection attempts in input before model sees it. Configurable threshold.
check_model personality_drift: Model deviating from its intended persona or instruction set.
check_model instruction_adherence: Model following system prompt and operational instructions.
check_rag source_verification: Retrieved document authenticity and relevance in RAG context.
check_rag embedding_attack: Embedding poisoning patterns in vectors retrieved from the database.
check_rag data_poisoning: Malicious content injected into the RAG knowledge base.
Integration Methods
Method | Description | Best For
Python Decorator | Add @policy_monitor to any async function; the policy engine wraps it automatically with no changes to your logic. | Fastest integration; single-function protection
Programmatic API | Use PolicyAPIService to create, deploy, and manage policies from code. Full lifecycle control via save_policy() and deploy_policy(). | Infrastructure-as-code, CI/CD pipeline management
Policy Workbench | Portal UI. Describe your policy in plain English, click Generate Policy, test it in the playground, deploy. No DSL knowledge needed. | Non-technical teams, rapid policy drafting
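The decorator pattern can be mimicked locally. The real @policy_monitor's signature and options are not shown on this page, so the stand-in below only demonstrates the wrap-without-changing-your-logic shape, with toy input and output checks standing in for deployed policies:

```python
import asyncio
import functools

def policy_monitor_sketch(func):
    """Stand-in for @policy_monitor: screen the input, call the wrapped
    function untouched, then screen the output."""
    @functools.wraps(func)
    async def wrapper(prompt: str) -> str:
        if "ignore previous instructions" in prompt.lower():
            return "[blocked by policy]"       # toy input check
        response = await func(prompt)          # your logic, unchanged
        if "555-0199" in response:             # toy output PII check
            return "[redacted by policy]"
        return response
    return wrapper

@policy_monitor_sketch
async def chat(prompt: str) -> str:
    # Your application logic; in production this would call an LLM.
    return f"echo: {prompt}"

result = asyncio.run(chat("hello"))
```

Clean traffic passes through untouched; flagged traffic never reaches, or never leaves, the wrapped function.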
Agent Identity
AgentID: AI Agent Identity and Fingerprinting
AgentID gives each AI agent a verifiable cryptographic identity. When agents
act autonomously, you need to know exactly which agent took which action,
under what authority, and whether its behaviour has drifted from its defined profile.
AgentID
Verified Identity for Every Agent in Your Ecosystem
In multi-agent systems, agents call tools, spawn sub-agents, and make decisions autonomously.
Without identity, there is no accountability. AgentID assigns a cryptographic fingerprint
to each agent, tracks its behaviour over time, and alerts when it deviates from its baseline.
ID Accuracy: 99.5%
Verify Time: <5ms
Drift Detection: 96%
Spoofing Prevention: 100%
Audit Coverage: 100%
Multi-agent Support: Yes
🔐 Cryptographic Fingerprinting
Each agent receives a unique cryptographic identity on registration. Identity is bound to the agent's model, version, system prompt hash, and authorised tool set.
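The binding described above can be sketched with standard hashing. The field names and digest construction are assumptions; AgentID's actual scheme is not documented here:

```python
import hashlib
import json

def fingerprint_sketch(model: str, version: str, system_prompt: str, tools: list) -> str:
    """Bind an identity to model, version, system-prompt hash, and tool set.
    Canonical JSON with sorted keys and a sorted tool list keeps the
    digest independent of field and tool ordering."""
    prompt_hash = hashlib.sha256(system_prompt.encode()).hexdigest()
    material = json.dumps(
        {"model": model, "version": version,
         "system_prompt_hash": prompt_hash, "tools": sorted(tools)},
        sort_keys=True,
    )
    return hashlib.sha256(material.encode()).hexdigest()
```

Any change to the model version, system prompt, or authorised tool set yields a different fingerprint, which is what makes model substitution detectable as a hash mismatch.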
📊 Behavioural Baseline
AgentID learns the normal behaviour profile for each agent: typical tool call patterns, response latency distributions, and semantic consistency scores.
⚡ Drift Alerting
When an agent deviates from its baseline, such as calling tools it has not called before, or shifting its response semantics, AgentID raises an alert and can suspend the agent.
🔗 Chain of Custody
In multi-agent pipelines, every hand-off is logged with the sending and receiving agent identities. You have a complete, unforgeable audit trail of every agent action.
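A hand-off trail of this kind can be illustrated with a hash-chained log, where each entry records both agent identities and commits to the previous entry so tampering breaks the chain. This is a local sketch of the idea; the real audit trail would presumably be cryptographically signed, which this stand-in omits:

```python
import hashlib
import json
from datetime import datetime, timezone

def log_handoff(trail: list, sender_id: str, receiver_id: str, action: str) -> None:
    """Append one hand-off record naming both agent identities.
    Each entry hash-chains to the previous one."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "sender": sender_id,
        "receiver": receiver_id,
        "action": action,
        "prev_hash": trail[-1]["entry_hash"] if trail else "",
    }
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    trail.append(entry)

trail = []
log_handoff(trail, "planner-01", "sql-agent-02", "delegate: run report query")
log_handoff(trail, "sql-agent-02", "planner-01", "return: query results")
```

Altering any earlier entry changes its hash and invalidates every later `prev_hash`, so gaps and edits are detectable.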
🛡️ Impersonation: Blocked (cryptographic proof of identity)
📡 Prompt Injection via Agent: Traced (source agent identified)
🔄 Goal Hijacking: Detected (drift from authorised objectives)
🔑 Privilege Escalation: Blocked (tool scope enforced at identity level)
📋 Audit Gap: Eliminated (100% action attribution)
⚠️ Model Substitution: Caught (version hash mismatch triggers alert)
Use Cases
Use Cases by Industry
Financial Services
Regulatory Compliance and Fraud
PII protection for customer financial data. Accuracy evaluation for AI-generated financial advice. SQL injection prevention for database agents. Audit trail evidence via policy telemetry for regulatory reporting.
Healthcare
PHI Protection and Clinical AI
PHI detection and redaction in clinical chatbots. Hallucination detection for clinical decision support tools. Bias detection in diagnostic AI. HIPAA-aligned audit logging through policy engine telemetry.
Enterprise AI
Internal Agent Guardrails
System prompt protection across internal agent deployments. Tool call controls for agents with file, database, and network access. RAG pipeline integrity checks across enterprise knowledge bases.
Customer Support
Public-Facing Chatbot Safety
Toxic content filtering for customer-facing chatbots. PII protection for customer data submitted through conversation. Guardrails for automated response and escalation systems.
Technical Reference
Technical Specifications
SDK package: pip install mirror-sdk
Python version: 3.9 and above
API endpoint: https://mirrorapi.azure-api.net/v1
Authentication: MIRROR_API_KEY environment variable
Response time SLA: Sub-200ms for real-time applications
Uptime SLA: 99.9% with global edge deployment
Scaling: Horizontal scaling; intelligent caching for throughput optimisation
Policy library: 12 ready-made policies across input, output, tool, and RAG protection
Platform
How AgentIQ Fits with Other Mirror Products
AgentIQ
Runtime Behaviour Layer
This product. Monitors semantic content and enforces policies at the model boundary in real time.
VectaX
Data Encryption Layer
Encrypts the underlying data. AgentIQ monitors the content layer on top of the encryption VectaX provides. The two products are designed to be deployed together.
DiscoveR
Security Testing Layer
Automated red teaming that finds weaknesses in the same applications AgentIQ protects at runtime. Use DiscoveR to validate your AgentIQ policies are effective.
Product Report
Download the AgentIQ Report
Get a complete PDF of this product page including architecture diagrams,
capability tables, use cases, and technical specifications.
Ready to share with your team or attach to a procurement brief.