Module 4 of 6 · Vector DB & RAG Security · Core Security Path

Access Control

Vector Store Access Control & Data Governance

Multi-tenancy isolation, RBAC at query time, metadata filtering failures, identity management for AI agents, and data governance for production RAG systems.

20 min read
Core Security
Intermediate


Section 01 · The Problem

Why access control fails in vector databases

Access control in vector databases fails for a specific reason that does not apply to traditional databases: the access decision and the data access happen at different places. In a relational database, the query engine enforces access control as part of executing the query. You cannot get data back that you are not authorised to see because the database checks before returning anything.

In most vector database deployments, access control is applied in the application layer, not the database layer. The application adds a namespace filter or metadata condition to the query before sending it. The vector database itself has no concept of which user is making the request or what they are allowed to see. It returns results based purely on similarity, and the application is supposed to filter them.

This creates a gap. Any bug, injection attack, or misconfiguration in the application layer removes all access control, because the database will happily return results from any namespace to any caller. The Cisco 2024 white paper identifies this as the root cause of most unauthorised access incidents in vector database deployments.

The fix has two parts. First, enforce isolation at the database layer through namespaces, collections, or indexes with dedicated credentials. Second, add cryptographic enforcement through VectaX so that even if the application layer is bypassed, the user cannot decrypt vectors outside their policy scope.

The core problem stated plainly: Most vector databases are secure if the application behaves correctly. They are not secure if the application has a bug. Production security requires controls that hold even when the application fails. That means database-layer isolation and cryptographic enforcement, not just application-layer filtering.
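The gap described above can be shown with a toy in-memory store. This is a minimal sketch, not any real vector database client: `VECTORS`, `db_query`, and `app_search` are hypothetical names, and similarity scoring is left out so that only the filtering logic is visible.

```python
# Toy store with no notion of caller identity (all names are hypothetical).
VECTORS = [
    {"id": "doc-1", "tenant_id": "acme_corp", "text": "Acme pricing"},
    {"id": "doc-2", "tenant_id": "globex", "text": "Globex payroll"},
]

def db_query(metadata_filter=None):
    """Return every record matching the filter; no filter means no restriction."""
    if not metadata_filter:
        return list(VECTORS)
    return [
        v for v in VECTORS
        if all(v.get(key) == value for key, value in metadata_filter.items())
    ]

def app_search(tenant_id):
    """Application-layer access control with a subtle bug."""
    # Bug: a missing tenant_id silently produces an empty filter.
    metadata_filter = {"tenant_id": tenant_id} if tenant_id else {}
    return db_query(metadata_filter)

normal = app_search("acme_corp")  # only Acme's record
leaked = app_search(None)         # every tenant's records
```

The store did exactly what it was asked in both calls. The only thing separating the tenants was one line of application code.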

Section 02 · Isolation Models

Multi-tenancy isolation: three models, three risk profiles

How you separate tenant data in a shared vector database directly determines what happens when access control breaks. There are three practical models, each with a different security strength and cost profile.

🗂
Namespace / Metadata Filter
All tenants share one index. Each vector is tagged with a tenant ID in metadata. Queries are filtered by tenant ID at the application layer. Cheapest to operate. Most commonly deployed.
Failure mode: Any application bug that drops the filter exposes all tenant data. The database has no way to enforce the boundary.
Weakest isolation
📦
Collection / Namespace Per Tenant
Each tenant gets a dedicated collection (Qdrant) or namespace (Pinecone), with separate API keys per collection. The database enforces that a key for collection A cannot read collection B. The application still manages routing, but the database provides a second enforcement layer.
Failure mode: A leaked collection-level key exposes one tenant. A misconfigured API key with broad permissions exposes multiple tenants.
Balanced
🔐
Index Per Tenant
Each tenant has a completely separate vector index with dedicated infrastructure. Physical separation. Used in healthcare, financial services, and government where compliance requires data segregation between clients or customers.
Trade-off: Highest cost. Each index consumes dedicated compute and storage. Appropriate when data sensitivity and compliance require it.
Strongest isolation

The namespace-only trap: Most RAG tutorials and quickstarts use namespace metadata filtering because it is the simplest to implement. Teams ship this to production without realising the isolation is entirely dependent on application-layer correctness. A Pinecone namespace is a logical partition within a shared index. It is not a security boundary enforced by the database. Never use namespace filtering as your only isolation mechanism for sensitive data.
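To make the difference between a logical partition and an enforced boundary concrete, here is a minimal sketch of the collection-per-tenant model. `ScopedStore` is a hypothetical toy class, not a Qdrant or Pinecone API. The point it illustrates is that the scope check lives inside the store, so a misrouted request from the application is rejected.

```python
class ScopedStore:
    """Toy store that binds each API key to exactly one collection (hypothetical)."""

    def __init__(self):
        self._collections = {}  # collection name -> list of records
        self._key_scope = {}    # API key -> collection name

    def create_collection(self, name, api_key):
        self._collections[name] = []
        self._key_scope[api_key] = name

    def insert(self, api_key, collection, record):
        self._authorize(api_key, collection)
        self._collections[collection].append(record)

    def query(self, api_key, collection):
        self._authorize(api_key, collection)
        return list(self._collections[collection])

    def _authorize(self, api_key, collection):
        # Enforced by the store itself, independent of application routing.
        if self._key_scope.get(api_key) != collection:
            raise PermissionError(f"key is not scoped to {collection!r}")

store = ScopedStore()
store.create_collection("tenant_a", api_key="key-a")
store.create_collection("tenant_b", api_key="key-b")
store.insert("key-a", "tenant_a", {"id": "a-1"})
store.insert("key-b", "tenant_b", {"id": "b-1"})

try:
    store.query("key-a", "tenant_b")  # misrouted request from the application
    cross_tenant_blocked = False
except PermissionError:
    cross_tenant_blocked = True
```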

Section 03 · Architecture

Pre-filtering vs post-filtering: which RBAC architecture to use

When you add RBAC to vector search, there are two architectural choices for where the access check happens relative to the similarity search itself. This decision affects both security and performance.

Pre-filtering (recommended)
1
Apply access policy to restrict the search space to only vectors the user is authorised to see
2
Run similarity search only across the authorised subset
3
Return top-K results from the authorised set
No unauthorised vectors are ever touched. Compute is not wasted on data the user cannot see. Recall is accurate.
Post-filtering (avoid for RBAC)
1
Run similarity search across the entire index, including vectors the user cannot access
2
Remove results that fail the access check from the response
3
Return whatever authorised results remain, which may be far fewer than top-K
Wastes compute on unauthorised data. When access is selective (user can only see a small fraction), results collapse. The HoneyBee paper (2025) shows post-filtering suffers severe recall degradation at high selectivity.
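The result collapse under post-filtering is easy to reproduce on toy data. The sketch below uses ten made-up 2-D vectors and a brute-force dot-product search in place of a real index. The `allowed` flag stands in for whatever access check the system applies.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Ten toy vectors; the user may only see ids 7, 8 and 9,
# which happen to be the least similar to the query.
index = [
    {"id": i, "vec": (1.0 - i / 10, i / 10), "allowed": i >= 7}
    for i in range(10)
]
query = (1.0, 0.0)
K = 3

def top_k(candidates, k):
    """Brute-force similarity search over the given candidates."""
    return sorted(candidates, key=lambda r: dot(query, r["vec"]), reverse=True)[:k]

# Pre-filtering: restrict the search space first, then search.
pre_filtered = [r["id"] for r in top_k([r for r in index if r["allowed"]], K)]

# Post-filtering: search everything, then drop unauthorised hits.
post_filtered = [r["id"] for r in top_k(index, K) if r["allowed"]]
```

Pre-filtering returns a full set of K authorised results, while post-filtering returns nothing here because all top-K hits over the full index were unauthorised.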

VectaX enforces the same guarantee as pre-filtering, but cryptographically rather than by restricting the search space. Each vector is encrypted with an access policy, and a user's decryption key only works on vectors that match their policy. The similarity search still runs across all vectors in the index (giving full recall), but only vectors within the user's policy scope can be decrypted. An attacker who retrieves an encrypted vector outside their policy scope gets ciphertext they cannot use.

This is better than pure pre-filtering because it does not require maintaining separate indexes per access level. The full index is searched for maximum recall, but access is enforced at the decryption layer, not the search layer.

Section 04 · VectaX RBAC

RBAC with VectaX: policies at role, group, and department level

Standard RBAC assigns permissions to roles such as reader, writer, or admin. This works for coarse-grained access but breaks down in enterprise AI systems where a single user may belong to multiple teams, have different access levels in different departments, and need context-aware permissions that standard roles cannot express.

VectaX implements multi-dimensional RBAC through its RBACVectorData class. Access policies attach to each vector at encryption time across three independent dimensions simultaneously: roles, groups, and departments. A user can only decrypt a vector if their key satisfies all three dimensions of the policy attached to that vector.

Role taxonomy for vector databases

| Role | Insert vectors | Query / retrieve | Manage indexes | Delete data | Admin functions | Typical users |
|---|---|---|---|---|---|---|
| Admin | Yes | Yes | Yes | Yes | Yes | Platform owners, security team |
| Data Scientist | Yes | Yes | Limited | No | No | ML engineers, researchers |
| Analyst | No | Yes | No | No | No | Business analysts, reporting |
| Ingestion Service | Yes | No | No | No | No | Automated pipeline services |
| Retrieval Service | No | Yes | No | No | No | RAG query services, chatbots |
| AI Agent | Scoped | Scoped | No | No | No | Autonomous agents with defined tasks |
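The role taxonomy can be expressed as a deny-by-default lookup. This is a sketch under stated assumptions: the identifiers (`ROLE_PERMISSIONS`, `is_allowed`, `within_scope`) are illustrative, and "Limited" and "Scoped" are modelled as grants that require a separate scope check, since the table does not define them further.

```python
# Permission matrix transcribed from the role table.
# Column order: insert, query, manage_indexes, delete, admin.
_ACTIONS = ("insert", "query", "manage_indexes", "delete", "admin")
_ROWS = {
    "admin":             ("yes", "yes", "yes", "yes", "yes"),
    "data_scientist":    ("yes", "yes", "limited", "no", "no"),
    "analyst":           ("no", "yes", "no", "no", "no"),
    "ingestion_service": ("yes", "no", "no", "no", "no"),
    "retrieval_service": ("no", "yes", "no", "no", "no"),
    "ai_agent":          ("scoped", "scoped", "no", "no", "no"),
}
ROLE_PERMISSIONS = {role: dict(zip(_ACTIONS, row)) for role, row in _ROWS.items()}

def is_allowed(role, action, within_scope=False):
    """Deny by default; 'limited' and 'scoped' grants need an explicit scope check."""
    grant = ROLE_PERMISSIONS.get(role, {}).get(action, "no")
    if grant == "yes":
        return True
    if grant in ("limited", "scoped"):
        return within_scope  # assumption: the caller has verified the scope separately
    return False
```

Unknown roles and unknown actions both fall through to a denial, which keeps a typo in a role name from turning into access.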

VectaX RBAC implementation

Python · VectaX multi-dimensional RBAC (from Qdrant integration docs)

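from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig
from mirror_sdk.core.models import RBACVectorData

# Assumed to exist before this point (setup is not shown in this excerpt):
#   sdk       - an initialised MirrorSDK instance, presumably configured via MirrorConfig
#   embedding - the embedding vector for the document being stored

# 1. Define the application-level policy (roles, groups, and departments available in the system)
app_policy = {
    "roles": ["admin", "hr_manager", "analyst"],
    "groups": ["team_hr", "team_finance", "team_legal"],
    "departments": ["human_resources", "finance", "legal"],
}
sdk.set_policy(app_policy)

# 2. Encrypt a vector with a specific access policy attached
# Only users with the hr_manager or admin role, in team_hr, in the human_resources dept can access it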
vector_data = RBACVectorData(
    vector=embedding,
    id="salary_review_q3_2026",
    access_policy={
        "roles": ["hr_manager", "admin"],
        "groups": ["team_hr"],
        "departments": ["human_resources"],
    }
)
encrypted = sdk.rbac.encrypt(vector_data)

# 3. Generate a per-user decryption key scoped to their actual roles
hr_manager_key = sdk.rbac.generate_user_secret_key({
    "roles": ["hr_manager"],
    "groups": ["team_hr"],
    "departments": ["human_resources"]
})
# analyst_key would NOT be able to decrypt salary_review_q3_2026
analyst_key = sdk.rbac.generate_user_secret_key({
    "roles": ["analyst"],
    "groups": ["team_finance"],
    "departments": ["finance"]
})


Why three dimensions matter: A user with role analyst in group team_finance and department finance can access finance documents but not HR salary data, even though both are in the same vector index. Their decryption key is scoped to all three dimensions. The HR salary vector requires human_resources department. The key mismatch means the decryption fails cryptographically, not at the application layer. No application bug can bypass this.
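The matching rule can be modelled in plain Python to make the logic explicit. This is only a set-based illustration of "all three dimensions must be satisfied", not the VectaX cryptographic mechanism. It assumes that within one dimension, matching any listed value is enough (as the `["hr_manager", "admin"]` role list suggests). `satisfies` is a hypothetical helper.

```python
DIMENSIONS = ("roles", "groups", "departments")

def satisfies(policy, user_attrs):
    """True only if the user matches at least one value in every dimension."""
    return all(
        set(policy[dim]) & set(user_attrs.get(dim, []))
        for dim in DIMENSIONS
    )

salary_policy = {
    "roles": ["hr_manager", "admin"],
    "groups": ["team_hr"],
    "departments": ["human_resources"],
}
hr_manager = {
    "roles": ["hr_manager"],
    "groups": ["team_hr"],
    "departments": ["human_resources"],
}
analyst = {
    "roles": ["analyst"],
    "groups": ["team_finance"],
    "departments": ["finance"],
}

hr_can_read = satisfies(salary_policy, hr_manager)
analyst_can_read = satisfies(salary_policy, analyst)
```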

Section 05 · ABAC

ABAC: access control for systems that cannot plan ahead

RBAC works well when access patterns are predictable. You know who the HR managers are, what documents they need, and the roles change slowly. But modern AI systems are dynamic. Agents run tasks on behalf of multiple users. Pipelines combine data from different departments. New use cases appear faster than role definitions can be updated.

Attribute-Based Access Control (ABAC) evaluates each access request against a set of attributes at decision time. Instead of asking "is this user an HR manager?", ABAC asks "does this request satisfy all required conditions?" The conditions can include time of day, network source, data classification level, request context, and user attributes simultaneously.

ABAC policy example: context-aware retrieval access

Condition set

request.source = internal_vpn
user.clearance >= document.classification
time.hour between 07:00 and 20:00
service.name in approved_services

All conditions met

Access granted. Documents returned matching the user's clearance level.

Any condition fails

Access denied. Request logged. Alert triggered if denial pattern is unusual.
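The condition set above could be evaluated along these lines. This is a minimal sketch: the `Request` fields, the service names in `APPROVED_SERVICES`, and the reading of "between 07:00 and 20:00" as 07:00 inclusive to 20:00 exclusive are all assumptions made for illustration.

```python
from dataclasses import dataclass

APPROVED_SERVICES = {"rag-retrieval", "support-chatbot"}  # hypothetical names

@dataclass
class Request:
    source: str
    user_clearance: int
    document_classification: int
    hour: int  # 0-23, local hour of the request
    service_name: str

def evaluate(req):
    """Return (granted, failed_conditions); every condition must hold."""
    checks = {
        "source": req.source == "internal_vpn",
        "clearance": req.user_clearance >= req.document_classification,
        "time": 7 <= req.hour < 20,
        "service": req.service_name in APPROVED_SERVICES,
    }
    failed = [name for name, ok in checks.items() if not ok]
    return (not failed, failed)

in_hours = evaluate(Request("internal_vpn", 3, 2, 10, "rag-retrieval"))
after_hours = evaluate(Request("internal_vpn", 3, 2, 23, "rag-retrieval"))
```

Returning the list of failed conditions alongside the decision gives the logging and alerting step something specific to record.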

ABAC adds complexity but is more appropriate for AI agent systems where the requesting identity is a service account acting on behalf of a human user. The agent's request attributes include both the agent's own identity and the delegated user's attributes. This lets you write policies like: "allow if the agent is an authorised service AND the user it is acting for has appropriate clearance AND the request comes from an approved execution environment."

VectaX's multi-dimensional policy system supports a hybrid of RBAC and ABAC. Roles, groups, and departments are RBAC dimensions. The access policy can be extended with additional attributes at policy definition time. AgentIQ from Mirror Security adds runtime policy enforcement that evaluates request context, which complements VectaX's per-vector access control with system-level ABAC.

Section 06 · Common Failure Mode

Metadata filtering failures: the access control that looks solid but is not

The most common access control pattern in RAG systems is to add a metadata filter to every query. The application knows the user's tenant ID, department, or clearance level, and adds a filter={"tenant_id": "acme_corp"} parameter to the similarity search query. This is easy to implement and works correctly in normal operation.

It breaks in at least four ways.

The fix: Metadata filtering is useful as a first-pass efficiency mechanism (reducing the search space) but must not be the only access control. Combine it with collection-level or index-level isolation (so direct API access hits a boundary) and VectaX cryptographic enforcement (so even a correct query returns undecryptable ciphertext to an unauthorised user).
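Where a metadata filter is kept as the first-pass mechanism, one way to make it less fragile is to derive the tenant condition from the authenticated session and to fail closed when it is missing. The sketch below is illustrative only: `build_filter`, `MissingIdentity`, and the session dictionary are hypothetical, and this hardens the filter without replacing the database-layer and cryptographic controls.

```python
class MissingIdentity(Exception):
    """Raised when a query is attempted without a tenant bound to the session."""

def build_filter(session, requested_filter=None):
    """Merge caller-supplied conditions with a server-side tenant condition."""
    tenant_id = session.get("tenant_id")
    if not tenant_id:
        # Fail closed: never fall back to an unfiltered query.
        raise MissingIdentity("no tenant bound to session; refusing to query")
    merged = dict(requested_filter or {})
    merged["tenant_id"] = tenant_id  # the server-side value always wins
    return merged

# A caller trying to override the tenant is silently corrected.
safe = build_filter(
    {"tenant_id": "acme_corp"},
    {"tenant_id": "globex", "topic": "pricing"},
)

try:
    build_filter({})
    failed_closed = False
except MissingIdentity:
    failed_closed = True
```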

Section 07 · Identity

Identity management: every request needs a traceable identity

Strong access control requires that every request to the vector database can be tied to a known, authenticated identity. Without this, access decisions are anonymous, audit logs are useless, and post-incident forensics cannot determine who accessed what or when.

In AI systems, the identities making requests are not just human users. There are four identity types, each with different security requirements.

👤
Human users
Developers, analysts, and administrators accessing the vector database through applications or direct API calls. They should authenticate via SSO with MFA and be issued VectaX user secret keys scoped to their role and department.
Well-understood. Standard IAM patterns apply.
Backend services
The ingestion pipeline, the retrieval service, and the LLM orchestration layer. Each should have its own service identity with scoped credentials. Never share credentials between services.
Risk: over-permissioned service accounts that accumulate access over time.
🕐
Scheduled jobs
Re-indexing jobs, data cleanup tasks, and batch processing pipelines. These run without human oversight and often have broad permissions set during initial setup that are never reviewed. Audit regularly.
Risk: stale permissions that outlive the original purpose of the job.
🤖
Autonomous AI agents
The highest-risk identity type. Agents make decisions and take actions without direct human supervision. They can be compromised through prompt injection, causing them to act outside their intended scope. Each agent needs its own identity, scoped strictly to what its task requires, with AgentIQ monitoring its runtime behaviour.
Risk: prompt injection + excessive permissions = highest blast radius in any RAG system.

Agent identity principle: An agent acting on behalf of user A should not have broader access than user A has. If user A can only retrieve documents from the HR namespace, the agent acting for user A should be constrained to the same namespace. This delegation scoping prevents privilege escalation through the agent layer.
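Delegation scoping amounts to taking the intersection of what the agent may do and what the user may do. A minimal sketch, with hypothetical scope dictionaries and a hypothetical `effective_scope` helper:

```python
def effective_scope(agent_scope, user_scope):
    """An agent acting for a user gets only what both are allowed."""
    return {
        "namespaces": agent_scope["namespaces"] & user_scope["namespaces"],
        "operations": agent_scope["operations"] & user_scope["operations"],
    }

agent_scope = {
    "namespaces": {"hr", "finance", "legal"},
    "operations": {"query", "insert"},
}
user_a_scope = {
    "namespaces": {"hr"},
    "operations": {"query"},
}

delegated = effective_scope(agent_scope, user_a_scope)
```

Because the result is an intersection, widening the agent's own grants later cannot widen what it can do on behalf of user A.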

Section 08 · Design Principle

Least privilege: scoping each component to exactly what it needs

Least privilege is the principle that each component in a system should have only the minimum permissions required for its specific function. In vector database deployments, this is frequently violated because teams assign broad permissions during development and never narrow them for production.

The Cisco 2024 white paper specifically identifies least privilege as one of the most effective ways to reduce risk in complex AI systems. Applied to a RAG pipeline, here is what it means in practice:

Least privilege permissions per RAG component

📥
Ingestion service
Write-only access to its designated namespace. Cannot read, query, or delete. Cannot access other namespaces.
Write only
🔍
Retrieval service
Read-only query access to authorised namespaces. Cannot insert, modify, or delete. Cannot access admin endpoints.
Read only
🔗
LLM orchestration layer
Receives retrieved results from the retrieval service. Does NOT have direct access to the vector database. Cannot run its own queries.
No direct DB access
🤖
AI agent
Scoped to the namespaces and operations its task requires. Permissions mirror the user it is acting for. Re-evaluated per task, not assigned permanently.
Task-scoped
🛠
Admin / maintenance
Full access but via MFA-protected accounts that are not used for operational tasks. Separate credentials from service accounts. All admin actions logged and reviewed.
MFA protected
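A simple way to check these scopes in practice is to compare what each component needs against what it has actually been granted. The sketch below assumes grants can be exported as sets of operation names. The component names and the `GRANTED` data are made up for illustration.

```python
# Minimum permissions per component, following the list above.
REQUIRED = {
    "ingestion_service": {"write"},
    "retrieval_service": {"query"},
    "llm_orchestrator": set(),  # no direct DB access
}

# Hypothetical export of current grants from the platform.
GRANTED = {
    "ingestion_service": {"write", "query"},
    "retrieval_service": {"query"},
    "llm_orchestrator": {"query"},
}

def excess_permissions(required, granted):
    """Return, per component, any granted operation it does not need."""
    report = {}
    for component, ops in granted.items():
        extra = ops - required.get(component, set())
        if extra:
            report[component] = sorted(extra)
    return report

violations = excess_permissions(REQUIRED, GRANTED)
```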

Section 09 · Governance

Data governance: knowing what you have and who owns it

Data governance for vector databases is the set of policies, processes, and records that answer four questions: what data is in the system, who is responsible for it, who has access to it, and what happens to it over time. Without governance, access control reviews are impossible, compliance audits become guesswork, and incident response lacks the information needed to understand blast radius.

📋
Data inventory
A record of every collection, namespace, and index in your vector database: what documents it contains, what classification level applies, which team owns it, and when it was last audited. Without an inventory you cannot answer a regulator's question about what personal data your AI system processes.
🗂
SBOM and AI-BOM
A Software Bill of Materials (SBOM) tracks library versions and dependencies. An AI Bill of Materials (AI-BOM) extends this to embedding models, fine-tuned weights, training datasets, and retrieval configuration versions. When a CVE affects a library or a research paper reveals a vulnerability in an embedding model architecture, your AI-BOM tells you immediately whether you are affected.
📝
Access review process
A scheduled review (quarterly for most systems, monthly for high-sensitivity) that checks whether all service identities, user accounts, and agent identities still need the access they have. People change roles. Services are deprecated. Agents are updated. Permissions that were appropriate six months ago may no longer be necessary.
🔄
Data lifecycle and retention
Vector databases accumulate embeddings over time. Data that should have been deleted under GDPR's right to erasure or a data retention policy may still be present in the vector index as an embedding. Implement a data lifecycle process that tracks when each document was embedded, flags documents past their retention date, and triggers re-indexing when source documents change.
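The retention check in the lifecycle process can be as small as a date comparison over ingestion records. A sketch, assuming each record stores the embedding date and a retention period in days (field names are hypothetical):

```python
from datetime import date, timedelta

def past_retention(records, today):
    """Return the ids of documents whose retention period has ended."""
    return [
        r["doc_id"]
        for r in records
        if r["embedded_on"] + timedelta(days=r["retention_days"]) < today
    ]

records = [
    {"doc_id": "contract-2024", "embedded_on": date(2024, 1, 1), "retention_days": 365},
    {"doc_id": "memo-2025", "embedded_on": date(2025, 1, 1), "retention_days": 90},
]
expired = past_retention(records, today=date(2025, 3, 1))
```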

GDPR and the right to erasure: Under GDPR Article 17, individuals can request deletion of their personal data. If a document containing personal data about a user was embedded into your vector index, deleting the source document is not sufficient. You must also delete the corresponding embedding vectors and any cached responses that may contain information derived from that document. This requires knowing exactly which vectors were generated from which source documents, which means maintaining the ingestion audit trail from Module 3.
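An erasure request then becomes a lookup in the audit trail followed by two deletions. The sketch below uses in-memory dictionaries as stand-ins for the audit trail, the vector index, and the response cache. It also assumes cached responses record which vectors they were derived from, which is a design choice rather than something every cache provides.

```python
# Hypothetical stand-ins for the real stores.
audit_trail = {"doc-42": ["vec-1", "vec-2"], "doc-7": ["vec-3"]}
vector_index = {"vec-1": [0.1, 0.2], "vec-2": [0.3, 0.4], "vec-3": [0.5, 0.6]}
response_cache = {
    "q-1": {"answer": "cached answer 1", "source_vectors": ["vec-2"]},
    "q-2": {"answer": "cached answer 2", "source_vectors": ["vec-3"]},
}

def erase_document(doc_id):
    """Delete a document's vectors and any cached response derived from them."""
    vector_ids = set(audit_trail.pop(doc_id, []))
    for vid in vector_ids:
        vector_index.pop(vid, None)
    stale = [
        key for key, entry in response_cache.items()
        if vector_ids & set(entry["source_vectors"])
    ]
    for key in stale:
        del response_cache[key]
    return sorted(vector_ids), sorted(stale)

removed_vectors, removed_cache_keys = erase_document("doc-42")
```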

Section 10 · Operations

Rotation policies and version pinning

Long-lived credentials and unpinned library versions are two operational risks that compound over time. Both are simple to address and both are consistently neglected in fast-moving AI projects.

| Credential / asset type | Recommended rotation frequency | Trigger for immediate rotation | Automated? |
|---|---|---|---|
| Vector DB API keys (production) | Every 30 to 60 days | Key exposure, team member departure, breach | Yes, where platform supports it |
| VectaX user secret keys | On role change or quarterly | User role changes, compromise suspected | Via re-issuance on access review |
| Encryption master keys (VectaX) | Annually with re-encryption | Key compromise confirmed | Requires re-embedding workflow |
| Service account credentials | Every 90 days | Service decommissioned, compromise suspected | Yes via secrets manager |
| Embedding model API keys (OpenAI, Cohere) | Every 60 days | Key leaked, provider advisory | Via secrets manager rotation |
| Embedding model weights (pinned version) | Review on provider release, update after testing | Security advisory, backdoor discovered | No, requires re-embedding after version change |
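The time-based rows of the rotation table can be checked automatically. This sketch takes the upper bound of each interval, approximates "annually" as 365 days, and uses a hypothetical credential inventory. The 7-day warning window is an arbitrary illustrative value.

```python
from datetime import date

# Upper bounds from the rotation table, in days ("annually" taken as 365).
MAX_AGE_DAYS = {
    "vector_db_api_key": 60,
    "service_account": 90,
    "embedding_api_key": 60,
    "encryption_master_key": 365,
}

def rotation_status(credentials, today, warn_days=7):
    """Classify each credential as ok, due_soon, or overdue."""
    status = {}
    for cred in credentials:
        age = (today - cred["rotated_on"]).days
        limit = MAX_AGE_DAYS[cred["type"]]
        if age > limit:
            status[cred["name"]] = "overdue"
        elif age > limit - warn_days:
            status[cred["name"]] = "due_soon"
        else:
            status[cred["name"]] = "ok"
    return status

# Hypothetical inventory entries.
credentials = [
    {"name": "vectordb-prod", "type": "vector_db_api_key", "rotated_on": date(2025, 1, 1)},
    {"name": "retrieval-svc", "type": "service_account", "rotated_on": date(2025, 2, 20)},
    {"name": "embed-api", "type": "embedding_api_key", "rotated_on": date(2025, 1, 10)},
]
status = rotation_status(credentials, today=date(2025, 3, 10))
```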

Version pinning for libraries: Pin the versions of LangChain, LlamaIndex, your vector database client, and the embedding model client in your requirements file. Unreviewed auto-updates have introduced breaking security regressions in orchestration libraries before. Test version updates in a staging environment against your RAG system's security controls before deploying to production. Keep a record of which library versions are running in each environment.
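A small check in CI can confirm that every line in a requirements file is pinned to an exact version. This is a sketch that handles only simple `name==version` lines (no environment markers, hashes, or URLs). The version numbers in the sample are illustrative, not recommendations.

```python
import re

# Matches "package==version", optionally with extras such as "package[extra]==1.0".
PINNED = re.compile(r"^[A-Za-z0-9_.\-\[\],]+==[A-Za-z0-9_.\-+!]+$")

def unpinned(requirements_text):
    """Return requirement lines that are not pinned with '=='."""
    bad = []
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and not PINNED.match(line):
            bad.append(line)
    return bad

sample = """\
langchain==0.2.5
llama-index>=0.10
qdrant-client
openai==1.30.1  # embedding model client
"""
problems = unpinned(sample)
```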

Section 11 · Visibility

Access monitoring: seeing what is actually happening

Access control without monitoring is security theatre. You have defined who should have access, but you have no way to verify that the rules are working, detect when they are being probed, or respond when they fail. Every production vector database deployment needs a monitoring layer that turns raw access events into actionable visibility.

Query volume per identity

Baseline each service identity's normal query rate. Sudden spikes may indicate adversarial probing or a pipeline malfunction. Plot per-identity volume over rolling 24-hour windows.

Alert on 3x normal rate

Failed authentication attempts

Repeated failures from the same source may indicate credential stuffing or key brute-force attempts. Log all authentication failures with source IP and identity.

Alert on 5+ failures per hour

Zero-result queries

A high volume of queries returning zero results can indicate adversarial probing of namespace boundaries or access controls. Legitimate users rarely make queries that return nothing repeatedly.

Investigate clusters of failures

Cross-namespace queries

Queries from identities that do not normally access a particular namespace. A finance service account querying the HR namespace should trigger immediate investigation.

Alert on first occurrence

Off-hours activity

Query or write activity outside normal business hours for identities that should only be active during those hours. Automated pipelines that run 24/7 should be baselined separately.

Alert on human accounts only

Credential rotation status

Track which keys are approaching their rotation date and which have exceeded it. Alert before expiry so rotation happens proactively rather than in response to a breach.

Dashboard metric
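Several of the thresholds above translate directly into rules over per-identity statistics. The sketch below assumes those statistics have already been aggregated from access logs. The field names and the `alerts` function are hypothetical, and the thresholds are the ones stated above (3x baseline query rate, 5 or more authentication failures per hour, first cross-namespace access).

```python
def alerts(stats, baseline):
    """Return the names of the alert rules triggered for one identity."""
    triggered = []
    if stats["queries_24h"] >= 3 * baseline["queries_24h"]:
        triggered.append("query_volume")
    if stats["auth_failures_1h"] >= 5:
        triggered.append("auth_failures")
    # Any namespace not in the identity's baseline counts as a first occurrence.
    if set(stats["namespaces"]) - set(baseline["namespaces"]):
        triggered.append("cross_namespace")
    return triggered

finance_baseline = {"queries_24h": 1000, "namespaces": ["finance"]}

quiet_day = alerts(
    {"queries_24h": 1100, "auth_failures_1h": 0, "namespaces": ["finance"]},
    finance_baseline,
)
suspicious_day = alerts(
    {"queries_24h": 3500, "auth_failures_1h": 6, "namespaces": ["finance", "hr"]},
    finance_baseline,
)
```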

Where AgentIQ fits in monitoring: AgentIQ from Mirror Security provides runtime monitoring specifically for AI agent behaviour, including tool calls, retrieved content handling, and output classification. For agentic RAG systems where an agent has retrieve-and-act capabilities, monitoring the agent's behaviour at runtime catches misuse that vector database query logs alone would not surface.
