Runtime Safety, Compliance, and Guardrails for AI Applications
AgentIQ is Mirror Security's AI safety platform. It sits between your users and your AI models, inspecting every interaction in real time and stopping threats before they produce harm. When a threat arrives, AgentIQ blocks it before the model responds. When a bad output is generated, it blocks it before the user sees it. It acts at the point of interaction, not after.
Most AI safety tools are bolt-on filters that inspect text after the model has already
responded. AgentIQ intercepts interactions at the model boundary using a policy engine
and detection APIs. Threats are caught before the model generates a response.
Bad outputs are caught before they reach the user. It covers the full surface of AI risk:
malicious inputs, harmful outputs, privacy violations, factual inaccuracy, and agent tool abuse.
Architecture
How AgentIQ Works
AgentIQ intercepts at the model boundary in both directions. Every user message passes
through input checks before the LLM sees it. Every model response passes through output
checks before the user receives it. Both layers evaluate in real time against your deployed policies.
Seven detection capabilities, exposed through clean Python APIs. Use them individually or through the Unified Safety API for a single-call evaluation.
🔒 PII Detection and Redaction
sdk.agentiq.detect_pii()
Detects personally identifiable information in text: email addresses, phone numbers, SSNs, credit card numbers, and names. Returns detected entities, a redacted version, and a risk score. Configurable action: ALERT, REDACT, BLOCK, SANITIZE, or ALLOW.
Returns: entities list, redacted_text, risk_score
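The service's response schema is not documented here, but the described return shape (entities list, redacted text, risk score) can be mimicked with a self-contained stand-in. The regexes and the scoring formula below are illustrative assumptions, not the product's detection logic:

```python
import re

# Illustrative patterns only -- the real service uses far richer detection.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def detect_pii_sketch(text: str) -> dict:
    """Mimic the documented return shape: entities, redacted_text, risk_score."""
    entities = []
    redacted = text
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            entities.append({"type": label, "value": match.group()})
        redacted = pattern.sub(f"[{label.upper()}]", redacted)
    # Naive risk score: more distinct entity types -> higher risk, capped at 1.0.
    risk_score = min(1.0, 0.4 * len({e["type"] for e in entities}))
    return {"entities": entities, "redacted_text": redacted, "risk_score": risk_score}

result = detect_pii_sketch("Reach me at jane@example.com or 555-867-5309.")
```

A configured action (REDACT, BLOCK, and so on) would then branch on these fields, for example blocking when `risk_score` exceeds a threshold and otherwise forwarding `redacted_text`.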
⚠️ Prompt Injection Detection
sdk.agentiq.detect_prompt_injection()
Detects attempts to override system instructions, extract system prompts, bypass safety measures, or manipulate model behaviour. Covers direct injection and indirect injection via retrieved RAG context. Returns a detection flag, injection-specific score, and overall confidence.
Returns: detected, prompt_injection, score
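The real detector is model-based; as a runnable illustration of the return shape (detected flag, injection-specific score, overall confidence), here is a toy keyword heuristic. The marker list and scoring are assumptions for demonstration only:

```python
import re

# Toy heuristic phrases; the actual detector is not a keyword list.
INJECTION_MARKERS = [
    r"ignore (all )?(previous|prior) instructions",
    r"reveal (your )?system prompt",
    r"you are now",
    r"disregard .* rules",
]

def detect_prompt_injection_sketch(text: str) -> dict:
    """Mimic the documented return shape: detected, prompt_injection, score."""
    hits = [m for m in INJECTION_MARKERS if re.search(m, text, re.IGNORECASE)]
    score = min(1.0, 0.5 * len(hits))  # assumed scoring, illustration only
    return {"detected": bool(hits), "prompt_injection": score, "score": score}
```

The same shape applies to indirect injection: in a RAG pipeline you would run the check over retrieved context chunks as well as the user message.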
🛡️ Toxicity and Bias Detection
sdk.agentiq.detect_bias()
Analyses content for toxicity and bias across multiple dimensions simultaneously. Returns separate result objects for toxicity and bias, each with a detected flag and confidence score. Evaluated through the moderation service for contextual accuracy.
Returns: list of toxicity and bias result objects
🔍 Hallucination Detection
sdk.agentiq.analyze_hallucination()
Evaluates AI responses for factual accuracy using pair-based analysis. Compares model output against provided context. Configurable detection threshold (default 0.5). Returns pairs of analysis results, each with a final score and hallucination classification.
Returns: pairs list with is_hallucination, final_score
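The service's scoring model is not documented here; as a self-contained illustration of the pair-based shape, where each pair carries a final score and a hallucination classification against the 0.5 default threshold, here is a naive token-overlap stand-in:

```python
def analyze_hallucination_sketch(context: str, output: str, threshold: float = 0.5) -> dict:
    """Score each output sentence against the context by token overlap.
    Stand-in for the service's pair-based analysis; the 0.5 default
    threshold mirrors the documentation, the overlap metric does not."""
    context_tokens = set(context.lower().split())
    pairs = []
    for sentence in filter(None, (s.strip() for s in output.split("."))):
        tokens = set(sentence.lower().split())
        overlap = len(tokens & context_tokens) / len(tokens)
        pairs.append({
            "sentence": sentence,
            "final_score": round(overlap, 3),
            "is_hallucination": overlap < threshold,
        })
    return {"pairs": pairs}

report = analyze_hallucination_sketch(
    context="The Eiffel Tower is in Paris",
    output="The Eiffel Tower is in Paris. It was built on Mars.",
)
```

A grounded sentence scores high against the context; the unsupported one falls below the threshold and is flagged.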
📊 RAG Quality: Context Quality
sdk.agentiq.analyze_context_quality()
Evaluates RAG system quality without requiring ground truth. Takes question, context, and LLM response. Returns quality score, relevance score, and accuracy score. Best for real-time production monitoring and A/B testing of RAG configurations.
Returns: metrics list (quality, relevance, accuracy)
✅ RAG Quality: Ground Truth
sdk.agentiq.analyze_ground_truth()
Validates factual accuracy when a reference answer is available. Returns faithfulness, answer correctness, context precision, context recall, entity recall, and answer similarity. Best for model evaluation, benchmarking, and training data quality assessment.
Returns: faithfulness, correctness, precision, recall
Unified Safety API
Run all safety checks in a single call. Results are keyed by check name so you can map each result back to its detection service. Checks auto-enable based on which inputs you provide. Run in serial (default, deterministic ordering) or parallel (concurrent, faster).
Check | Required Inputs | Returns
prompt_injection | text or conversation | detected, score, prompt_injection
toxicity | text or conversation | detected, score
bias | text or conversation | detected, score
pii | text or conversation | entities, redacted_text, risk_score
context_quality | question, context, llm_response | metrics (quality, relevance, accuracy)
ground_truth | question, context, llm_response, ground_truth | faithfulness, correctness, precision, recall
hallucination | input_text, output_text | pairs with is_hallucination
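The auto-enable and serial-versus-parallel behaviour described above can be sketched with a toy dispatcher. The check registry and check bodies below are stand-ins, not the real detection services; only the pattern, enabling checks by provided inputs and keying results by check name, reflects the documentation:

```python
from concurrent.futures import ThreadPoolExecutor

# Toy check registry: name -> (required inputs, check function).
CHECKS = {
    "pii": ({"text"}, lambda kw: {"detected": "@" in kw["text"]}),
    "toxicity": ({"text"}, lambda kw: {"detected": "idiot" in kw["text"].lower()}),
    "context_quality": ({"question", "context", "llm_response"},
                        lambda kw: {"metrics": {"quality": 0.9}}),
}

def evaluate_sketch(parallel: bool = False, **inputs) -> dict:
    """Run every check whose required inputs were provided; key results by name."""
    enabled = {name: fn for name, (req, fn) in CHECKS.items() if req <= inputs.keys()}
    if parallel:
        # Concurrent evaluation; results still collected under stable keys.
        with ThreadPoolExecutor() as pool:
            futures = {name: pool.submit(fn, inputs) for name, fn in enabled.items()}
            return {name: f.result() for name, f in futures.items()}
    # Serial evaluation: deterministic, one check at a time.
    return {name: fn(inputs) for name, fn in enabled.items()}
```

Passing only `text` enables the text checks; passing the RAG triple enables `context_quality` instead, without any explicit configuration.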
Policy Engine
AgentIQ Policy Engine
The Policy Engine defines what gets blocked, what gets allowed, and what gets flagged.
Policies are enforced in real time at the model boundary.
Policies evaluate against named resources: message input, message output,
tool_call, tool_output, rag, and embedding.
Rules are declarative: deny rules block when a condition is true; allow rules permit specific traffic once every deny rule has passed;
check statements invoke the detection APIs for deeper semantic evaluation.
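The deny-then-allow ordering can be illustrated with a toy evaluator. The rule structure, field names, and default verdict below are assumptions for demonstration; they are not the actual policy DSL:

```python
def evaluate_rules_sketch(rules: list, request: dict) -> str:
    """Deny rules evaluate first; allow rules only permit traffic that
    survives every deny rule. Unmatched traffic is flagged for review."""
    for rule in (r for r in rules if r["effect"] == "deny"):
        if rule["condition"](request):
            return "blocked"
    for rule in (r for r in rules if r["effect"] == "allow"):
        if rule["condition"](request):
            return "allowed"
    return "flagged"

# Hypothetical rules over a message_input resource.
rules = [
    {"effect": "deny", "condition": lambda r: "ssn" in r["text"].lower()},
    {"effect": "allow", "condition": lambda r: r["resource"] == "message_input"},
]
```

Note the ordering guarantee: a request matching both a deny and an allow rule is still blocked, because deny rules are exhausted first.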
Three Ways to Create Policies
01
Natural Language Workbench
Portal → AgentIQ → Policy Manager → Policy Workbench.
Describe what you want to protect in plain English. The engine converts
it to a compilable DSL policy. Test it. Deploy it. No DSL knowledge required.
02
12 Pre-built Policies
Copy-paste ready. The policies cover PII blocking, injection prevention,
toxicity filtering, hallucination detection, SQL injection, file security,
RAG pipeline protection, and full production security chains.
03
Custom DSL
Full grammar control. Write conditional policies, policy chains, tool call controls,
threshold-based checks, and lambda expressions over collections.
See the Grammar Reference for complete syntax documentation.
12 Ready-Made Policies
Policy | Protects Against | Deploy When
block_pii_input | PII in user messages: SSN, credit card, phone | Any app handling user input
block_pii_output | PII leaked in model responses | Healthcare, finance, customer-facing apps
prevent_injection | Prompt injection and jailbreak attempts | All production deployments
input_limits | Empty inputs and oversized inputs (over 10,000 characters) | APIs with usage quotas
content_safety | Toxic and biased model output | Public chatbots, content generation
hallucination_detection | Factually incorrect model output | RAG systems, knowledge Q&A
file_security | Agents reading sensitive system files (/etc/, .env) | Coding assistants, DevOps agents
sql_security | SQL injection patterns in agent-generated queries | Database agents, data analysis
network_security | SSRF via agent HTTP requests to internal addresses | Web scraping agents, API integrations
rag_security | Tampered sources, context manipulation, poisoning | Enterprise RAG, document Q&A
embedding_security | Embedding poisoning in vector database pipelines | Semantic search, RAG with external sources
production_security | Full chain: input, output, and tool protection | Any production AI deployment
Output and Behaviour Check Types
check_output hallucination: Model output against context for factual grounding. Configurable threshold.
check_output toxicity: Harmful or offensive language in model response.
check_output bias: Biased language across multiple demographic dimensions.
check_output pii: PII in model response before delivery to user.
check_output factual_consistency: Internal consistency of the model's response.
check_output code_injection: Injected executable code patterns in model output.
check_prompt injection: Injection attempts in input before model sees it. Configurable threshold.
check_model personality_drift: Model deviating from its intended persona or instruction set.
check_model instruction_adherence: Model following system prompt and operational instructions.
check_rag source_verification: Retrieved document authenticity and relevance in RAG context.
check_rag embedding_attack: Embedding poisoning patterns in vectors retrieved from the database.
check_rag data_poisoning: Malicious content injected into the RAG knowledge base.
Integration Methods
Method | Description | Best For
Python Decorator | Add @policy_monitor to any async function; the policy engine wraps it automatically with no changes to your logic. | Fastest integration; single-function protection
Programmatic API | Use PolicyAPIService to create, deploy, and manage policies from code. Full lifecycle control via save_policy() and deploy_policy(). | Infrastructure-as-code, CI/CD pipeline management
Policy Workbench | Portal UI. Describe your policy in plain English, click Generate Policy, test it in the playground, deploy. No DSL knowledge needed. | Non-technical teams, rapid policy drafting
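The decorator pattern can be mimicked locally. The real @policy_monitor's signature and options are not shown on this page, so the stand-in below only demonstrates the wrap-without-changing-your-logic shape, with toy input and output checks standing in for deployed policies:

```python
import asyncio
import functools

def policy_monitor_sketch(func):
    """Stand-in for @policy_monitor: screen the input, call the wrapped
    function untouched, then screen the output."""
    @functools.wraps(func)
    async def wrapper(prompt: str) -> str:
        if "ignore previous instructions" in prompt.lower():
            return "[blocked by policy]"       # toy input check
        response = await func(prompt)          # your logic, unchanged
        if "555-0199" in response:             # toy output PII check
            return "[redacted by policy]"
        return response
    return wrapper

@policy_monitor_sketch
async def chat(prompt: str) -> str:
    # Your application logic; in production this would call an LLM.
    return f"echo: {prompt}"

result = asyncio.run(chat("hello"))
```

Clean traffic passes through untouched; flagged traffic never reaches, or never leaves, the wrapped function.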
Agent Identity
AgentID: AI Agent Identity and Fingerprinting
AgentID gives each AI agent a verifiable cryptographic identity. When agents
act autonomously, you need to know exactly which agent took which action,
under what authority, and whether its behaviour has drifted from its defined profile.
AgentID
Verified Identity for Every Agent in Your Ecosystem
In multi-agent systems, agents call tools, spawn sub-agents, and make decisions autonomously.
Without identity, there is no accountability. AgentID assigns a cryptographic fingerprint
to each agent, tracks its behaviour over time, and alerts when it deviates from its baseline.
ID Accuracy: 99.5%
Verify Time: <5ms
Drift Detection: 96%
Spoofing Prevention: 100%
Audit Coverage: 100%
Multi-agent Support: Yes
🔐 Cryptographic Fingerprinting
Each agent receives a unique cryptographic identity on registration. Identity is bound to the agent's model, version, system prompt hash, and authorised tool set.
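The binding described above can be sketched with standard hashing. The field names and digest construction are assumptions; AgentID's actual scheme is not documented here:

```python
import hashlib
import json

def fingerprint_sketch(model: str, version: str, system_prompt: str, tools: list) -> str:
    """Bind an identity to model, version, system-prompt hash, and tool set.
    Canonical JSON with sorted keys and a sorted tool list keeps the
    digest independent of field and tool ordering."""
    prompt_hash = hashlib.sha256(system_prompt.encode()).hexdigest()
    material = json.dumps(
        {"model": model, "version": version,
         "system_prompt_hash": prompt_hash, "tools": sorted(tools)},
        sort_keys=True,
    )
    return hashlib.sha256(material.encode()).hexdigest()
```

Any change to the model version, system prompt, or authorised tool set yields a different fingerprint, which is what makes model substitution detectable as a hash mismatch.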
📊 Behavioural Baseline
AgentID learns the normal behaviour profile for each agent: typical tool call patterns, response latency distributions, and semantic consistency scores.
⚡ Drift Alerting
When an agent deviates from its baseline, such as calling tools it has not called before, or shifting its response semantics, AgentID raises an alert and can suspend the agent.
🔗 Chain of Custody
In multi-agent pipelines, every hand-off is logged with the sending and receiving agent identities. You have a complete, unforgeable audit trail of every agent action.
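A hand-off trail of this kind can be illustrated with a hash-chained log, where each entry records both agent identities and commits to the previous entry so tampering breaks the chain. This is a local sketch of the idea; the real audit trail would presumably be cryptographically signed, which this stand-in omits:

```python
import hashlib
import json
from datetime import datetime, timezone

def log_handoff(trail: list, sender_id: str, receiver_id: str, action: str) -> None:
    """Append one hand-off record naming both agent identities.
    Each entry hash-chains to the previous one."""
    entry = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "sender": sender_id,
        "receiver": receiver_id,
        "action": action,
        "prev_hash": trail[-1]["entry_hash"] if trail else "",
    }
    entry["entry_hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()
    ).hexdigest()
    trail.append(entry)

trail = []
log_handoff(trail, "planner-01", "sql-agent-02", "delegate: run report query")
log_handoff(trail, "sql-agent-02", "planner-01", "return: query results")
```

Altering any earlier entry changes its hash and invalidates every later `prev_hash`, so gaps and edits are detectable.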
🛡️ Impersonation: Blocked (cryptographic proof of identity)
📡 Prompt Injection via Agent: Traced (source agent identified)
🔄 Goal Hijacking: Detected (drift from authorised objectives)
🔑 Privilege Escalation: Blocked (tool scope enforced at identity level)
📋 Audit Gap: Eliminated (100% action attribution)
⚠️ Model Substitution: Caught (version hash mismatch triggers alert)
Use Cases
Use Cases by Industry
Financial Services
Regulatory Compliance and Fraud
PII protection for customer financial data. Accuracy evaluation for AI-generated financial advice. SQL injection prevention for database agents. Audit trail evidence via policy telemetry for regulatory reporting.
Healthcare
PHI Protection and Clinical AI
PHI detection and redaction in clinical chatbots. Hallucination detection for clinical decision support tools. Bias detection in diagnostic AI. HIPAA-aligned audit logging through policy engine telemetry.
Enterprise AI
Internal Agent Guardrails
System prompt protection across internal agent deployments. Tool call controls for agents with file, database, and network access. RAG pipeline integrity checks across enterprise knowledge bases.
Customer Support
Public-Facing Chatbot Safety
Toxic content filtering for customer-facing chatbots. PII protection for customer data submitted through conversation. Guardrails for automated response and escalation systems.
Technical Reference
Technical Specifications
SDK package: pip install mirror-sdk
Python version: 3.9 and above
API endpoint: https://mirrorapi.azure-api.net/v1
Authentication: MIRROR_API_KEY environment variable
Response time SLA: Sub-200ms for real-time applications
Uptime SLA: 99.9% with global edge deployment
Scaling: Horizontal scaling; intelligent caching for throughput optimisation
Policy library: 12 ready-made policies across input, output, tool, and RAG protection
Platform
How AgentIQ Fits with Other Mirror Products
AgentIQ
Runtime Behaviour Layer
This product. Monitors semantic content and enforces policies at the model boundary in real time.
VectaX
Data Encryption Layer
Encrypts the underlying data. AgentIQ monitors the content layer on top of the encryption VectaX provides. The two products are designed to be deployed together.
DiscoveR
Security Testing Layer
Automated red teaming that finds weaknesses in the same applications AgentIQ protects at runtime. Use DiscoveR to validate your AgentIQ policies are effective.
Product Report
Download the AgentIQ Report
Get a complete PDF of this product page including architecture diagrams,
capability tables, use cases, and technical specifications.
Ready to share with your team or attach to a procurement brief.