What is verifiable inference and how does it close the inference gap?

Verifiable inference means the inference process provides cryptographic proof that data was not exposed in plaintext during computation. There are two practical approaches. Confidential computing uses hardware-isolated memory regions (enclaves) and remote attestation to prove that inference ran in a measured environment where the operator could not see the data. Fully homomorphic encryption keeps data encrypted even during computation: the model operates on ciphertext and produces ciphertext, so plaintext never appears even under host compromise. VectaX implements the FHE approach for vector database operations, keeping embeddings encrypted throughout storage and retrieval.

What is AgentID and how does it implement zero trust for AI agents?

AgentID is Mirror Security's identity and access management layer for autonomous agents. It uses a two-plane architecture: an Identity Broker handles identity verification, policy evaluation, token issuance, introspection, revocation, and audit; a Resource Gateway handles runtime validation, policy checks, rate limiting, and connector enforcement before downstream calls. Tokens are short-lived (minutes, not months), capability-scoped (containing explicit resource targets, allowed scopes, and constraints), and carry delegation context (which agent instance, under which delegated principal, under which tenant and policy). The gateway enforces authorization continuously at request time, not just at token issuance.

What are the four zero trust principles applied to AI systems?

Verify explicitly: check identity, context, and health at every request, not just at login or token issuance. For AI this means verifying which model version is running, which agent instance is acting, and whether the environment has been attested. Use least privilege: scope tokens and permissions to the minimum capability required for the specific task. For AI agents this means capability-scoped tokens with explicit constraints, not broad service account permissions. Assume breach: design for a world where components are compromised. Short-lived credentials, compartmentalized blast radius, and fast revocation reduce the damage any single compromise can cause. Verify continuously: enforce policy at request time with runtime context, not just at authentication. For AI systems this means inline guardrails that check every model output and every agent action.

How do VectaX, AgentIQ, DiscoveR, and AgentID each address a different zero trust plane?

The four Mirror Security products map to four distinct zero trust planes for AI. VectaX addresses the data plane: it encrypts vector embeddings using similarity-preserving FHE so inference never exposes plaintext data. AgentIQ addresses the runtime plane: it enforces inline guardrails on model outputs including PII detection, hallucination scoring, bias and toxicity analysis, and prompt injection defence. DiscoveR addresses the model plane: it continuously runs adversarial tests to validate that model behaviour stays within expected boundaries before and after deployment. AgentID addresses the identity plane: it issues capability-scoped tokens, tracks delegation lineage, and enforces access policies at the gateway for every agent action.

Zero Trust Architecture for AI Systems | Track 3E

Q: What is the inference gap in AI zero trust?

The inference gap is the security exposure that occurs during AI model inference. Most AI deployments correctly encrypt data at rest and in transit. But at inference time, data is typically decrypted into memory so the model can process it. Once data exists as plaintext in RAM, it is reachable by privileged insiders, compromised hosts, memory scraping malware, crash dumps, misconfigured telemetry, and side-channel attacks. Even an architecture that meets all standard compliance requirements still has this plaintext window during inference. Verifiable inference through confidential computing or fully homomorphic encryption closes this gap.

Q: Why do AI agents need different identity management from traditional IAM?

Traditional IAM is designed for humans and stable services that have fixed roles and long-lived credentials. AI agents have different characteristics: they are often short-lived and task-scoped, they execute delegated actions on behalf of users or workflows, they may spawn hierarchies of child agents, and they can perform high-volume API activity at machine speed. Mapping these workloads to shared service accounts loses two core security properties: precise identity at the workload instance level, and constrained capability at the operation level. One compromised shared credential can create a disproportionate blast radius when identity and access boundaries are not workload-specific.

Section 01

Why traditional zero trust falls short for AI

Zero trust is the right security model for modern infrastructure. It replaced the assumption that anything inside the network perimeter is safe with continuous verification of every request. For human users and stable microservices, it works well.

AI systems break the two key assumptions zero trust was built on. First, traditional ZT was designed for human identities: users who have fixed roles, stable attributes, and audit trails. AI agents are short-lived, task-scoped, spawn sub-agents dynamically, and operate at machine speed. The identity model does not fit. Second, traditional ZT treats computation as a black box: it verifies who can access data, but it does not verify what happens to data during computation. With AI, computation is where the security risk lives.

The result: organisations can have a fully compliant zero trust posture for their user access layer while simultaneously running AI inference that exposes sensitive data in plaintext to any privileged insider, and deploying AI agents that operate under shared credentials with far more access than any individual task requires.

Traditional zero trust: what it covers

Verify user identity before granting network access

Least-privilege access to databases and APIs

Encrypt data at rest and in transit

Micro-segmentation of network traffic

Does NOT verify what happens during computation

Does NOT handle short-lived, task-scoped agent identities

Does NOT track delegation across agent hierarchies

Does NOT enforce capability constraints per AI operation

AI-aware zero trust: what you also need

Verifiable inference: proof that data was not exposed during computation

Short-lived capability-scoped tokens per agent task, not shared credentials

Delegation lineage: which agent, under which principal, for which tenant

Runtime enforcement of policy at every request, not just at token issuance

Continuous adversarial testing of model behaviour between deployments

Inline guardrails on model outputs at inference time

Fast revocation: seconds, not hours, when an agent or token is compromised

Audit trails that link machine actions to agent instance, principal, and policy

Section 02

The inference gap

Most security-conscious AI deployments correctly implement encryption at rest and encryption in transit. Data on disk is encrypted. Data on the wire is encrypted. This is necessary. It is not sufficient.

At inference time, data must be decrypted. The model reads plaintext inputs. The context window contains plaintext documents. The embeddings are computed from plaintext text. For the duration of inference, this data exists as plaintext in memory. That plaintext is reachable by anyone or anything with access to the compute environment.

The inference gap in a standard AI pipeline

📜

Data at rest

AES-256

👥

Transit to model

TLS 1.3

😀

Inference (plaintext in RAM)

⚠ Exposed

👥

Response to client

TLS 1.3

📜

Stored output

AES-256

During inference: privileged insiders, compromised hosts, memory-scraping malware, crash dumps, misconfigured telemetry, and side-channel attacks can all reach plaintext data even in a fully compliant deployment.

This is not a hypothetical risk. Memory scraping malware specifically targets AI inference servers because they process high-value plaintext at predictable intervals. Crash dumps from inference servers have been found to contain sensitive patient data and financial records in healthcare and banking deployments. Misconfigured logging tools have captured plaintext context windows in plain text log files accessible to developers with no data access clearance.

The inference gap exists in every standard AI deployment. It is not a bug in any specific product. It is a structural consequence of how inference works: the model needs to see the data to process it. Closing the gap requires either encrypting what the model sees, or verifying that the environment is trustworthy even if the data is plaintext.

Compliance does not close the inference gap. A deployment can pass SOC 2, HIPAA, and GDPR audits while still exposing sensitive data in plaintext during inference. Access controls, encryption at rest, and TLS in transit are all audit-checkable. Plaintext exposure during inference is not directly audited by any of these frameworks. This is an emerging gap in AI-specific compliance requirements.

Section 03

Verifiable inference

Verifiable inference means the inference process provides a technical guarantee that data was not exposed in plaintext during computation. Two practical approaches exist. They differ in their trust assumptions and performance characteristics.

Confidential computing (enclaves)

Hardware-based, near-plaintext performance

Inference runs inside a hardware-isolated memory region (SGX, AMD SEV, ARM CCA)

Remote attestation proves to the data owner that inference ran in the expected measured environment

Operator cannot see workload data even with physical access to the server

Near-plaintext performance: overhead is typically 5 to 20 percent

Trust assumption: trusts the hardware boundary (CPU vendor) but not the infrastructure operator

Hardware vulnerabilities (Spectre, Meltdown variants) can undermine guarantees

Best for: high-throughput inference where FHE overhead is unacceptable

Fully homomorphic encryption (FHE)

Cryptographic, no hardware trust needed

Data stays encrypted even during computation: model operates on ciphertext

Plaintext never appears in memory, even under host compromise

No trust in the infrastructure operator required at all

Strongest cryptographic control: guaranteed by mathematics not hardware

Performance overhead: VectaX achieves 150-240 tok/s on 7B models vs 200-300 plaintext

No hardware-level side-channel attack surface

Best for: regulated inference on highly sensitive data where no operator trust is acceptable

VectaX implements the FHE path for vector database operations. Embeddings are encrypted using similarity-preserving FHE before reaching the vector store. Similarity search runs on encrypted vectors without decryption. Retrieved results are decrypted only at the authorised client. An attacker with access to the vector database cannot reconstruct document content from the stored ciphertexts. The data plane gap is closed at the retrieval layer.

Section 04

Sovereignty and the inference gap

In regulated industries, AI sovereignty is often treated as a geography question: keep data inside national borders, use domestic providers, host in local data centers. That approach helps. It does not produce real sovereignty.

The moment inference begins, the question stops being where the data sits and becomes who can see it while computation happens. A hospital can keep data inside EU borders and meet GDPR residency requirements. But if patient scans appear as plaintext during inference, a breach exposes the crown jewels regardless of which country the server is in. Geography does not protect plaintext.

Real sovereignty requires technical guarantees at compute time. Controls that hold even when systems fail, operators are compromised, or infrastructure is breached. Residency requirements, access controls, and audit trails are all necessary. None of them change what happens when inference requires plaintext in memory.

Access controls reduce who can reach a system. They do not eliminate the fact that the system processes plaintext. Audit trails tell you what happened after an incident. They do not prevent exposure during inference. If your sovereignty story ends at "where the server is," it ends right before the part that matters.

Mirror Security on sovereignty: "Sovereignty isn't a location claim. It's control, verifiable control, over what happens to your data when intelligence runs." Two technical approaches close the gap: confidential computing (enclaves and remote attestation for near-plaintext performance) and FHE (computation on ciphertext, no trust in environment needed). Same goal, different trust assumptions.

📋 Mirror Blog · Sovereignty Without Verifiable Inference Is a Mirage

Section 05

Four zero trust principles for AI

The four core zero trust principles apply to AI systems, but each one takes on an AI-specific meaning that goes beyond the traditional interpretation. Here is what each principle means when AI models and autonomous agents are in scope.

🔍

Verify explicitly

Classic: check identity and device health before granting access

For AI: verify which model version is running, which agent instance is acting, which delegation chain it is operating under, and whether the environment has been attested at every request. A valid user token does not mean the downstream agent is safe or behaving correctly.

🔓

Use least privilege

Classic: grant minimum permissions needed for the role

For AI: issue capability-scoped tokens per task, not broad service-account permissions. A customer support agent should have a token that can only issue refunds up to the original transaction value, not read all payment records. Every agent action gets the minimum capability for that specific operation, not a shared credential with broad access.

💥

Assume breach

Classic: design for a world where the perimeter is already breached

For AI: design for a world where any agent instance can be compromised. Short-lived tokens (minutes, not months) limit the blast radius of a compromise. Compartmentalised access means one compromised agent cannot pivot to other systems. Fast revocation (seconds) ensures a compromised credential window is narrow. Every agent action is audited with lineage, so forensics is possible.

🔄

Verify continuously

Classic: re-authenticate periodically, check device posture

For AI: enforce policy at every model output and every agent action, not just at token issuance. AgentIQ runs inline guardrails on every response: PII in outputs, hallucination, prompt injection, toxicity. DiscoveR runs continuous adversarial testing between and during deployments. A model that was safe at deployment may not be safe after fine-tuning or after a new attack technique is discovered.

Section 06

The agent identity crisis

AI agents are now operating across payment systems, ticketing systems, internal data stores, and SaaS control planes. But in too many deployments, access control still relies on shared long-lived secrets designed for static services. This pattern is an active operational risk, not a future concern.

Traditional IAM works well for humans and stable services. AI agents have fundamentally different characteristics that break the IAM model.

Shared credential model vs capability-scoped model

⚠ Shared service account (current state for most)

Identity: One service account shared across all agent instances

Lifetime: Static secret, lasts months or years

Scope: Broad API access (read, write, modify, delete across the service)

On compromise: Attacker has full service access indefinitely

Revocation: Rotating the credential breaks all agents using it simultaneously

Attribution: "The payment service agent" did it. No further granularity.

Delegation: Not tracked. No lineage across agent hierarchies.

✓ Zero trust agent identity (target state)

Identity: Unique identity per agent instance, per task, per tenant

Lifetime: Short-lived token, expires in minutes

Scope: Scoped to one operation (payments:refund for customer X, amount limit $50)

On compromise: Attacker holds a token expiring in minutes, for one bounded operation

Revocation: Revoke one token without affecting other agents

Attribution: Agent-73-instance-4, delegated by user Alice, under Tenant-B policy

Delegation: Full lineage tracked across parent and child agent hierarchy

The failure mode from shared credentials is predictable. Broad access where narrow access is needed. Weak attribution when incidents happen. Slow revocation during active response. Difficult compliance evidence for delegated machine actions. One compromise creates a disproportionate blast radius when identity and access boundaries are not workload-specific.

📋 Mirror Blog · Zero Trust for AI Agents: Solving Identity and Access with AgentID

Section 07

AgentID architecture

AgentID from Mirror Security applies zero trust principles to autonomous agent systems. It uses a two-plane architecture that separates token management from runtime enforcement. This separation means you can add runtime enforcement to existing infrastructure without replacing your authentication stack, and you can change enforcement policies without changing how tokens are issued.

AgentID two-plane architecture

Identity Broker

Identity verification: authenticate the requesting agent instance

Policy evaluation: evaluate tenant and role-based policy for the request

Token issuance: mint short-lived capability-scoped tokens

Token introspection: validate active tokens on demand

Revocation: invalidate specific tokens or all tokens for an agent

Audit: log all token issuance and revocation events with full context

Resource Gateway

Runtime token validation: check token activity and revocation status before every call

Scope enforcement: verify the requested operation is within token scope

Constraint checks: validate per-connector constraints (amount limits, field restrictions)

Rate limiting: enforce per-token and per-agent rate controls

Policy decisions: runtime policy re-evaluation using current request context

Connector integration: forward validated requests to downstream systems

This model turns authorization into continuous verification at request time, not one-time trust at token issuance. A token that was valid when issued can be revoked before the next request reaches the gateway.

The Resource Gateway enforces at request time what the Identity Broker authorized at token issuance. If an agent's scope token says it can issue a refund up to $50, the gateway checks that constraint against the actual refund amount in the request before forwarding to the payments API. The agent cannot bypass this check by sending a request directly to the API, because the API is not accessible except through the gateway.

Connector-based integrations support common enterprise systems: payments, ticketing, collaboration, source control, secrets management, cloud identity, and data access surfaces. Each connector defines the constraint schema that the gateway can evaluate for requests targeting that system.

Section 08

Capability token design

A capability token for an AI agent is not a standard OAuth access token. It carries more context than a scope string. The additional fields are what make zero trust enforcement possible at the gateway: without resource target, constraint, and delegation context, the gateway cannot make a meaningful policy decision.

Anatomy of an AgentID capability token

agent_instance_id

Unique identifier for this specific agent instance. Not a shared service ID. Links every action to one workload for audit and forensics.

delegated_principal

The human user or system whose context this agent is acting under. Supports multi-hop delegation: agent A delegated by user Alice, spawning agent B. Full lineage tracked.

tenant_id

Which tenant or organisation context the action is scoped to. Prevents cross-tenant access even when agents share infrastructure.

capability

Specific operation permitted. Example: "payments:refund" not "payments:*". One token per task, not one token per service.

resource_target

The specific resource this capability applies to. Example: customer_id "cust_8821", not all customers. Prevents lateral movement to other records.

constraints

Per-operation limits enforced at the gateway. Example: {"max_amount": 47.50, "currency": "EUR"}. Evaluated against the actual request parameters before forwarding.

policy_id

Which policy rule authorized this token. Links the token to the policy version at issuance time. Enables audit of which policy permitted which action.

exp (expiry)

Token lifetime in seconds. Typically 300 to 900 seconds (5 to 15 minutes). After expiry, gateway rejects the token even if it passes all other checks.

Recommended adoption path for AgentID. Start with capability token issuance and introspection. Add runtime enforcement through the gateway. Enable federated identity for delegated enterprise principals. Adopt spawn-time agent lifecycle and delegation lineage controls. Expand connector constraints for high-risk workflows. This phased path delivers immediate security improvement while reducing deployment risk at each step.

Section 09

SPIFFE, SPIRE, and AgentID

A common question when implementing agent zero trust: do we need AgentID if we already use SPIFFE/SPIRE? The answer is that they solve different problems. Both are necessary in a complete zero trust implementation for AI agents.

SPIFFE / SPIRE

Identity attestation layer

Answers: which workload is calling? (verifiable identity document)

Provides cryptographic proof of service identity via X.509 SVIDs

Works at the infrastructure level: services, pods, VMs

Does NOT answer: which capability is the agent allowed to use?

Does NOT constrain which customer's data can be accessed

Does NOT track delegation across agent parent-child hierarchies

Does NOT enforce per-operation constraints at a gateway

AgentID

Capability and policy enforcement layer

Sits above the attestation layer. Can consume SPIFFE SVIDs as identity inputs.

Adds capability scoping: which operation, on which resource, with which constraints

Adds delegation context: agent instance, delegated principal, tenant, policy

Runtime gateway enforcement at every request, not just identity verification

Revocation at the capability level, not just the identity level

Together: SPIFFE proves "who is calling", AgentID proves "what they are allowed to do"

Cloud workload identity and AgentID. Cloud providers (AWS, GCP, Azure) offer workload identity federation that solves "which service is calling?" But none of them answer "which agent instance, acting under which delegated user, with which scoped capability, under which tenant policy?" Cloud workload identity is an input to AgentID, not a replacement for it. AgentID uses it as an attestation source, then layers capability scoping, policy evaluation, delegation tracking, and gateway enforcement on top.

Section 10

Worked example: the refund agent

From the Mirror Security blog on zero trust for AI agents: consider a customer support agent that needs to issue a refund. This is a common agentic workflow that illustrates exactly how the shared credential model fails and how zero trust capability tokens fix it.

Without AgentID: the agent authenticates with a shared service account that has broad access to the payments API: read transactions, issue refunds, modify subscriptions, update billing details. A compromised runtime can reuse that credential for any of those operations, indefinitely.

With AgentID: the flow looks different.

1

Agent requests a capability token from the Broker

The agent identifies itself (SPIFFE SVID or equivalent) and requests the specific capability it needs for this task.

capability: "payments:refund", resource: "customer_id:cust_7821", constraint: "max_amount:47.50"

Token request

2

Broker verifies identity and evaluates tenant policy

Policy check: "Support agents may issue refunds but may not modify subscriptions." Agent identity verified. Requested capability is within policy.

policy_result: PERMIT, token_ttl: 300s, constraint_sealed: true

Policy evaluation

3

Broker mints a short-lived capability token

Token expires in 5 minutes. Scoped to payments:refund for one customer only. Amount constraint sealed into the token so the gateway can enforce it.

token: eyJ..., exp: now+300s, scope: "payments:refund:cust_7821"

Token issued

4

Agent presents token to the Resource Gateway

Gateway validates token: active, not revoked. Checks scope matches requested operation. Checks refund amount against the max_amount constraint in the token.

gateway_check: token_valid=true, scope_match=true, amount=47.50 <= max_amount=47.50

Gateway enforces

5

Gateway forwards validated request to payments API

The payment is processed. The refund is issued. Token expires immediately after or after 5 minutes, whichever comes first.

Action executed

6

Audit trail created with full lineage

Audit record links the refund to: agent instance ID, delegated principal (customer support user), tenant policy ID, token ID, amount, customer, timestamp. Full forensic chain available.

audit: {agent_id, principal, policy_id, capability, resource, amount, timestamp}

Audited

If the agent runtime is compromised during step 4, the attacker holds a token that can only issue one bounded refund for one customer, expiring in minutes. Compare that with the shared credential that can modify any billing record, for any customer, indefinitely. The blast radius difference is the practical value of zero trust for agents.

Section 11

The four AI trust planes

A complete zero trust architecture for AI covers four distinct planes. Each addresses a different category of risk. No single product covers all four. Mirror Security's platform is designed so that each product addresses one plane, and the four planes together cover the full attack surface of a production AI deployment.

🔒

Data plane

Encrypted inference and vector search

The data that flows through the AI pipeline must never be exposed in plaintext on the server side. Embeddings are encrypted before storage. Similarity search runs on encrypted vectors. Retrieved documents are decrypted only at the authorised client. Closes the inference gap at the retrieval layer.

VectaX · Similarity-preserving FHE for vector databases

🎯

Model plane

Continuous adversarial validation

The model itself must be continuously tested for known attack patterns. A model safe at deployment may not be safe after fine-tuning, after new prompt injection techniques are discovered, or after the threat landscape shifts. Continuous red teaming validates model behaviour against the current attack surface.

DiscoveR · Automated AI red teaming and attack surface testing

⚙

Runtime plane

Inline guardrails on every output

Every model output must be checked before it reaches the user or triggers downstream actions. PII in outputs, hallucinated facts, toxic content, and prompt injection attempts are all runtime events that continuous verification must catch. Guardrails run inline: they do not block requests but they flag, filter, or block based on policy.

AgentIQ · Runtime guardrails for PII, hallucination, injection, toxicity

👤

Identity plane

Capability-scoped agent access control

Every agent action must be authorized by a capability-scoped token with delegation lineage and runtime enforcement. Short-lived tokens, workload-specific identities, constraint validation at the gateway, and fast revocation are all required to prevent one compromised agent from becoming a broad breach.

AgentID · Zero trust identity and access for autonomous agents

From the Mirror Security blog: "Most organizations today address these concerns piecemeal: a guardrail vendor here, an encryption tool there, identity bolted on as an afterthought. The gaps between those point solutions are where breaches happen. Mirror's platform closes those gaps by design." A production AI system that closes only two of the four planes has an incomplete zero trust posture. Adversaries find and exploit the open planes.

Section 12

DiscoveR on the model plane

Zero trust applied to models means you never assume a model is safe because it passed evaluation at deployment. The threat landscape changes. New attack techniques for prompt injection, jailbreaks, and data extraction appear continuously. A model that was secure last month may be vulnerable today because an attacker has published a new technique that your static evaluation set did not cover.

DiscoveR continuously runs adversarial tests against your deployed models and AI applications. It registers your application endpoint, selects attack categories relevant to your system type and threat model, and runs structured probes. Results include per-category pass rates, severity scores, and a correlation ID that links scans across remediation cycles so you can measure whether a fix held.

In a zero trust context, DiscoveR operationalises the "verify continuously" principle at the model plane. It is not a one-time pentest. It runs on a schedule or trigger, covers the current attack surface, and produces structured findings that feed into your security monitoring workflow.

Section 13

AgentIQ on the runtime plane

Zero trust applied to the runtime plane means every model output is checked before it acts in the world. A customer support agent that outputs a hallucinated refund amount, leaks PII from a retrieved document, or has been redirected by a prompt injection in a customer message are all runtime security failures. They cannot be caught by a static scan because they depend on what the model sees and produces at inference time.

AgentIQ provides runtime guardrails that run inline, applied to model outputs before they reach downstream systems. The guardrail categories map directly to the zero trust "verify continuously" principle. PII detection catches sensitive data that should not appear in outputs or agent actions. Hallucination scoring validates whether factual claims have support in the context. Prompt injection defence detects attempts to redirect the agent away from its intended behaviour. Chain security validation checks the integrity of multi-step agent workflows.

In a zero trust context, AgentIQ is the enforcement point for the runtime plane. It does not inspect model weights or training data. It operates on the live stream of inputs and outputs at the moment of inference, making trust decisions per response rather than per deployment.

Section 14

Frequently asked questions

What is the inference gap and why does it matter for zero trust?

The inference gap is the security exposure that occurs during AI model inference. Most deployments correctly encrypt data at rest and in transit. But at inference time, data is typically decrypted into memory so the model can process it. Once data exists as plaintext in RAM, it is reachable by privileged insiders, compromised hosts, memory scraping malware, crash dumps, misconfigured telemetry, and side-channel attacks. This gap exists even in architectures that pass all standard compliance audits. Verifiable inference through confidential computing or FHE closes this gap by ensuring plaintext never appears in the compute environment.

Why do AI agents need different identity management from traditional IAM?

AI agents are short-lived, task-scoped, may spawn child agents, and operate at machine speed. Mapping them to shared service accounts designed for static services loses two core security properties: precise identity at the workload instance level, and constrained capability at the operation level. One compromised shared credential creates a disproportionate blast radius when identity and access boundaries are not workload-specific. Zero trust for agents requires short-lived capability-scoped tokens, delegation lineage tracking, and runtime enforcement at every request.

What is AgentID and how does it work?

AgentID is Mirror Security's zero trust identity and access management layer for autonomous agents. It uses two planes: an Identity Broker that handles token issuance, policy evaluation, revocation, and audit; and a Resource Gateway that validates tokens, enforces capability scope, checks constraints, and applies rate limits at every request. Tokens are short-lived (minutes), capability-scoped (specific operation on specific resource with constraints), and carry delegation context (agent instance, delegated principal, tenant, policy). This turns authorization into continuous enforcement at request time rather than one-time trust at token issuance.

How do the four Mirror Security products map to zero trust planes?

VectaX addresses the data plane: similarity-preserving FHE keeps vector embeddings encrypted throughout storage and retrieval, closing the inference gap at the retrieval layer. DiscoveR addresses the model plane: continuous adversarial testing validates that model behaviour stays within expected boundaries as the threat landscape evolves. AgentIQ addresses the runtime plane: inline guardrails check every model output for PII, hallucination, prompt injection, and toxicity. AgentID addresses the identity plane: capability-scoped tokens, delegation lineage, and gateway enforcement control what each agent is allowed to do at every request.

Does geographic data residency provide sovereignty for AI inference?

No. Geographic residency requirements (keeping data within national borders) address jurisdiction and procurement compliance. They do not change what happens when inference requires plaintext in memory. A hospital that keeps patient data within EU borders but runs standard inference still exposes patient data as plaintext during computation. Real sovereignty requires technical guarantees at compute time: either confidential computing with remote attestation, or FHE which eliminates the need to trust the infrastructure operator at all. Residency is necessary but not sufficient for sovereign AI inference.

We already use SPIFFE/SPIRE. Do we still need AgentID?

SPIFFE and SPIRE solve a different problem. They answer "which workload is calling?" with a cryptographically verifiable identity document. They do not answer "which specific capability is the agent allowed to use, on which specific resource, with which constraints, under which delegated principal and tenant policy?" AgentID sits above the attestation layer. It can consume SPIFFE SVIDs as identity inputs, then layer capability scoping, policy evaluation, delegation context, and gateway enforcement on top. You need both: SPIFFE proves identity, AgentID proves authorized capability.

Zero Trust Architecture for AI Systems