What is the difference between pre-filtering and post-filtering RBAC in vector databases?

Pre-filtering RBAC applies access control before the similarity search runs, restricting which vectors are searched based on the user's permissions. This ensures only authorised vectors are ever considered. Post-filtering runs the similarity search first across all vectors and then removes results the user cannot access. Post-filtering wastes compute on vectors the user cannot see and produces recall problems when the filter is selective (most results are removed). Pre-filtering is more secure and more efficient for high-selectivity access policies.

How does VectaX implement RBAC for vector databases?

VectaX implements RBAC through its RBACVectorData class, which attaches an access policy to each vector at encryption time. The policy specifies which roles, groups, and departments can access that vector. When a user queries the database, their user secret key (generated via mirror_sdk.rbac.generate_user_secret_key()) determines which encrypted vectors they can decrypt and retrieve. Access is enforced cryptographically: a user without the correct key cannot decrypt vectors outside their policy scope, even if they can query the index.

What is a metadata filtering failure in RAG systems?

A metadata filtering failure occurs when access control is implemented as a filter on metadata fields rather than enforced at the database or cryptographic level. If the filter is applied in application code before the query, any bug or injection that bypasses the application layer removes all access control. If the filter is applied as a query parameter, a user who can modify their queries can simply omit the filter. Metadata filtering as the sole access control mechanism is fragile. It must be combined with database-level namespace isolation or cryptographic enforcement.

Why do AI agents need special identity management?

AI agents act autonomously, making retrieval and tool-use decisions without direct human supervision. Without strong identity management, agent actions are anonymous: the audit log shows the agent accessed data, but not which user request triggered it or what policy should have applied. Agents also tend to accumulate permissions over time as new capabilities are added. Each agent should have its own service identity with access scoped only to the namespaces and operations it needs. When an agent is compromised through prompt injection, a scoped identity limits blast radius.

What is least privilege for RAG components?

Least privilege for RAG means each component gets only the minimum access needed for its specific function. The ingestion service gets write-only access to its designated namespace. The retrieval service gets read-only access. The LLM orchestration layer gets read-only access to retrieved results but cannot query the vector database directly. Admin functions are separated from operational functions. No single component has full read-write-admin access. This limits the damage if any one component is compromised through a vulnerability or injection attack.

What is an AI-BOM and why does it matter for governance?

An AI Bill of Materials (AI-BOM) extends the traditional software SBOM to include AI-specific components: embedding models, fine-tuned weights, training datasets, retrieval configurations, and third-party API dependencies. CISA has published guidance on SBOMs and regulators are increasingly requiring them for AI systems in regulated industries. An AI-BOM lets you answer questions like: which embedding model version is in production, when was it last updated, what training data was used, and are there known vulnerabilities in any component. Without an AI-BOM, you cannot reliably respond to a supply chain security advisory.

What is index-per-tenant isolation and when should you use it?

Index-per-tenant isolation gives each tenant their own separate vector index, so their data is physically separated from all other tenants. This is the strongest isolation model. It is appropriate when tenant data is highly sensitive, when compliance requires physical separation, or when tenants have very different access patterns. The trade-off is cost: each index consumes dedicated compute and storage. Namespace-based isolation within a shared index is cheaper but weaker, as application-layer bugs can break the isolation. For healthcare, financial services, and government use cases, index-per-tenant is the recommended baseline.

How do you enforce access control on vector search queries at runtime?

Runtime query-time access enforcement means the access decision happens when the query executes, not just at ingestion. There are three mechanisms: database-level namespace filtering (Pinecone namespaces, Qdrant collection-level API keys), row-level security in pgvector, and cryptographic enforcement using VectaX where only vectors matching the user's decryption key are returned. The most robust approach combines all three: namespace isolation limits what the query can see, row-level security filters results, and VectaX encryption ensures that even if a filter is bypassed, the vectors cannot be decrypted by an unauthorised user.

What rotation policies should apply to vector database credentials?

API keys for vector databases should be rotated on a schedule of 30 to 90 days depending on sensitivity. Encryption keys should be rotated annually with re-encryption of stored vectors. User secret keys in VectaX should be re-issued when role assignments change. Service account credentials should be rotated when team members who had access leave. All rotation events should be logged. Rotation should be automated wherever the vector database platform supports it. Long-lived credentials are one of the most common causes of ongoing access after a breach.

What does a vector database monitoring dashboard need to show?

A production vector database monitoring dashboard should show: query volume per service identity over time (to detect sudden changes), failed authentication attempts, queries that return zero results (possible access control bypass attempt or poisoning), queries from unexpected source IPs or service accounts, write events with source identity and namespace, metadata classification mismatches, and key rotation status. Anomaly alerting should trigger on bulk query spikes, off-hours activity, and queries to namespaces by identities that do not normally access them.

How does segmentation reduce risk in vector database deployments?

Segmentation separates development from production environments, separates different tenants or business units, and separates different data classification levels. A well-segmented deployment limits blast radius: a compromise in the development environment does not affect production data, a breach in one tenant's namespace does not expose another tenant's data. Segmentation also simplifies governance by making it clear what data is where, who should have access to it, and what compliance controls apply. In Pinecone, use separate indexes per environment. In Qdrant, use separate collections with collection-level keys. In pgvector, use schema-level separation with role-based access.

What is the difference between RBAC and ABAC for AI systems?

RBAC grants access based on predefined roles such as analyst, engineer, or admin. It is simple to manage and audit but cannot handle dynamic context. ABAC evaluates access based on attributes including user identity, request source, time of day, data classification, and environment. ABAC is better suited to AI systems where agents, automated pipelines, and multiple services interact dynamically. For example, an ABAC policy can allow access only if the request comes from within the company VPN, during business hours, and targets documents classified at the requester's clearance level. VectaX supports multi-dimensional policies combining role, group, and department, bridging RBAC and ABAC.

Vector Store Access Control & Data Governance | Vector DB & RAG Security

Section 01 · The Problem

Why access control fails in vector databases

Access control in vector databases fails for a specific reason that does not apply to traditional databases: the access decision and the data access happen in different places. In a relational database, the query engine enforces access control as part of executing the query. You cannot get back data that you are not authorised to see, because the database checks before returning anything.

In most vector database deployments, access control is applied in the application layer, not the database layer. The application adds a namespace filter or metadata condition to the query before sending it. The vector database itself has no concept of which user is making the request or what they are allowed to see. It returns results based purely on similarity, and the application is supposed to filter them.

This creates a gap. Any bug, injection attack, or misconfiguration in the application layer removes all access control, because the database will happily return results from any namespace to any caller. The Cisco 2024 white paper identifies this as the root cause of most unauthorised access incidents in vector database deployments.

The fix has two parts. First, enforce isolation at the database layer through namespaces, collections, or indexes with dedicated credentials. Second, add cryptographic enforcement through VectaX so that even if the application layer is bypassed, the user cannot decrypt vectors outside their policy scope.

The core problem stated plainly: Most vector databases are secure if the application behaves correctly. They are not secure if the application has a bug. Production security requires controls that hold even when the application fails. That means database-layer isolation and cryptographic enforcement, not just application-layer filtering.

Section 02 · Isolation Models

Multi-tenancy isolation: three models, three risk profiles

How you separate tenant data in a shared vector database directly determines what happens when access control breaks. There are three practical models, each with a different security strength and cost profile.

🗂

Namespace / Metadata Filter

All tenants share one index. Each vector is tagged with a tenant ID in metadata. Queries are filtered by tenant ID at the application layer. Cheapest to operate. Most commonly deployed.

Failure mode: Any application bug that drops the filter exposes all tenant data. The database has no way to enforce the boundary.

Weakest isolation

📦

Collection / Namespace Per Tenant

Each tenant gets a dedicated collection (Qdrant) or namespace (Pinecone), with separate API keys per collection. The database enforces that a key for collection A cannot read collection B. The application still manages routing, but the database provides a second enforcement layer.

Failure mode: A leaked collection-level key exposes one tenant. A misconfigured API key with broad permissions exposes multiple tenants.

Balanced

🔐

Index Per Tenant

Each tenant has a completely separate vector index with dedicated infrastructure. Physical separation. Used in healthcare, financial services, and government where compliance requires data segregation between clients or customers.

Trade-off: Highest cost. Each index consumes dedicated compute and storage. Appropriate when data sensitivity and compliance require it.

Strongest isolation

The namespace-only trap: Most RAG tutorials and quickstarts use namespace metadata filtering because it is the simplest to implement. Teams ship this to production without realising the isolation is entirely dependent on application-layer correctness. A Pinecone namespace is a logical partition within a shared index. It is not a security boundary enforced by the database. Never use namespace filtering as your only isolation mechanism for sensitive data.

Section 03 · Architecture

Pre-filtering vs post-filtering: which RBAC architecture to use

When you add RBAC to vector search, there are two architectural choices for where the access check happens relative to the similarity search itself. This decision affects both security and performance.

Pre-filtering

1. Apply access policy to restrict the search space to only vectors the user is authorised to see
2. Run similarity search only across the authorised subset
3. Return top-K results from the authorised set

No unauthorised vectors are ever touched. Compute is not wasted on data the user cannot see. Recall is accurate.

Post-filtering

1. Run similarity search across the entire index, including vectors the user cannot access
2. Remove results that fail the access check from the response
3. Return whatever authorised results remain, which may be far fewer than top-K

Wastes compute on unauthorised data. When access is selective (user can only see a small fraction), results collapse. The HoneyBee paper (2025) shows post-filtering suffers severe recall degradation at high selectivity.
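The difference between the two orderings can be seen in a few lines of plain Python. This is a minimal sketch over a toy in-memory index, not any vector database's API: the `index` contents, the `tenant` field, and the function names are all illustrative assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query, items, k):
    """Return the k items most similar to the query, best first."""
    return sorted(items, key=lambda it: cosine(query, it["vector"]), reverse=True)[:k]

def pre_filter_search(query, index, allowed_tenant, k):
    # Restrict the search space first, then rank only authorised vectors.
    authorised = [it for it in index if it["tenant"] == allowed_tenant]
    return top_k(query, authorised, k)

def post_filter_search(query, index, allowed_tenant, k):
    # Rank everything first, then drop results the caller may not see.
    candidates = top_k(query, index, k)
    return [it for it in candidates if it["tenant"] == allowed_tenant]

# Toy index (hypothetical data): tenant "b" owns the vectors closest to the query.
index = [
    {"id": "b1", "tenant": "b", "vector": [1.0, 0.0]},
    {"id": "b2", "tenant": "b", "vector": [0.9, 0.1]},
    {"id": "b3", "tenant": "b", "vector": [0.8, 0.2]},
    {"id": "a1", "tenant": "a", "vector": [0.5, 0.5]},
    {"id": "a2", "tenant": "a", "vector": [0.1, 0.9]},
]
query = [1.0, 0.0]

pre = pre_filter_search(query, index, "a", k=2)
post = post_filter_search(query, index, "a", k=2)
```

With k=2, the pre-filter path returns both of tenant a's vectors, while the post-filter path returns nothing, because the two top-ranked vectors belong to tenant b and are removed after ranking.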

VectaX achieves the security guarantee of pre-filtering through cryptographic means rather than by restricting the search space. Each vector is encrypted with an access policy. A user's decryption key only works on vectors that match their policy. The similarity search still runs across all vectors in the index (giving full recall), but only vectors within the user's policy scope can be decrypted. An attacker who retrieves an encrypted vector outside their policy scope gets ciphertext they cannot use.

This is better than pure pre-filtering because it does not require maintaining separate indexes per access level. The full index is searched for maximum recall, but access is enforced at the decryption layer, not the search layer.

Section 04 · VectaX RBAC

RBAC with VectaX: policies at role, group, and department level

Standard RBAC assigns permissions to roles such as reader, writer, or admin. This works for coarse-grained access but breaks down in enterprise AI systems where a single user may belong to multiple teams, have different access levels in different departments, and need context-aware permissions that standard roles cannot express.

VectaX implements multi-dimensional RBAC through its RBACVectorData class. Access policies attach to each vector at encryption time across three independent dimensions simultaneously: roles, groups, and departments. A user can only decrypt a vector if their key satisfies all three dimensions of the policy attached to that vector.

Role taxonomy for vector databases

Role	Insert vectors	Query / retrieve	Manage indexes	Delete data	Admin functions	Typical users
Admin	Yes	Yes	Yes	Yes	Yes	Platform owners, security team
Data Scientist	Yes	Yes	Limited	No	No	ML engineers, researchers
Analyst	No	Yes	No	No	No	Business analysts, reporting
Ingestion Service	Yes	No	No	No	No	Automated pipeline services
Retrieval Service	No	Yes	No	No	No	RAG query services, chatbots
AI Agent	Scoped	Scoped	No	No	No	Autonomous agents with defined tasks
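One way to make a taxonomy like this enforceable is to encode it as a deny-by-default permission matrix. The sketch below is an illustration only: the `Action` names and role keys are made up for this example, "Scoped" cells are modelled as requiring a namespace match, and the "Limited" cell for Data Scientist is treated as denied, which is an assumption rather than something the table specifies.

```python
from enum import Enum

class Action(Enum):
    INSERT = "insert"
    QUERY = "query"
    MANAGE_INDEXES = "manage_indexes"
    DELETE = "delete"
    ADMIN = "admin"

# Unconditional grants per role, mirroring the "Yes" cells in the table.
ROLE_PERMISSIONS = {
    "admin": set(Action),
    "data_scientist": {Action.INSERT, Action.QUERY},
    "analyst": {Action.QUERY},
    "ingestion_service": {Action.INSERT},
    "retrieval_service": {Action.QUERY},
    "ai_agent": set(),
}

# Conditional grants ("Scoped" cells): allowed only inside assigned namespaces.
SCOPED_PERMISSIONS = {
    "ai_agent": {Action.INSERT, Action.QUERY},
}

def is_allowed(role, action, namespace=None, scoped_namespaces=frozenset()):
    """Deny by default; unknown roles and unlisted actions get nothing."""
    if action in ROLE_PERMISSIONS.get(role, set()):
        return True
    if action in SCOPED_PERMISSIONS.get(role, set()):
        return namespace is not None and namespace in scoped_namespaces
    return False
```

For example, `is_allowed("ai_agent", Action.QUERY, "hr", {"hr"})` passes, while the same call against the `"finance"` namespace is refused.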

VectaX RBAC implementation

Python · VectaX multi-dimensional RBAC (from Qdrant integration docs)

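from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig
from mirror_sdk.core.models import RBACVectorData

# 0. Initialise the SDK. The MirrorConfig arguments shown here are assumptions;
#    check the VectaX SDK documentation for the exact fields your version expects.
config = MirrorConfig(api_key="YOUR_API_KEY", server_url="YOUR_SERVER_URL", secret="YOUR_SECRET")
sdk = MirrorSDK(config)

# Placeholder embedding; in practice this comes from your embedding model.
embedding = [0.12, -0.08, 0.33]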

# 1. Define the application-level policy (roles available in the system)
app_policy = {
    "roles": ["admin", "hr_manager", "analyst"],
    "groups": ["team_hr", "team_finance", "team_legal"],
    "departments": ["human_resources", "finance", "legal"],
}
sdk.set_policy(app_policy)

# 2. Encrypt a vector with a specific access policy attached
# Only users with the hr_manager or admin role, in team_hr, in the human_resources dept can access
vector_data = RBACVectorData(
    vector=embedding,
    id="salary_review_q3_2026",
    access_policy={
        "roles": ["hr_manager", "admin"],
        "groups": ["team_hr"],
        "departments": ["human_resources"],
    }
)
encrypted = sdk.rbac.encrypt(vector_data)

# 3. Generate a per-user decryption key scoped to their actual roles
hr_manager_key = sdk.rbac.generate_user_secret_key({
    "roles": ["hr_manager"],
    "groups": ["team_hr"],
    "departments": ["human_resources"]
})
# analyst_key would NOT be able to decrypt salary_review_q3_2026
analyst_key = sdk.rbac.generate_user_secret_key({
    "roles": ["analyst"],
    "groups": ["team_finance"],
    "departments": ["finance"]
})

Why three dimensions matter: A user with role analyst in group team_finance and department finance can access finance documents but not HR salary data, even though both are in the same vector index. Their decryption key is scoped to all three dimensions. The HR salary vector requires human_resources department. The key mismatch means the decryption fails cryptographically, not at the application layer. No application bug can bypass this.
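The matching rule can be modelled in plain Python to make the logic concrete. This is only a model of the policy semantics, not VectaX's cryptographic implementation, and it assumes that a user needs at least one matching value in each dimension (any-of within a dimension, all-of across dimensions). The function and variable names are illustrative.

```python
DIMENSIONS = ("roles", "groups", "departments")

def satisfies(policy, user_attrs):
    """True only if the user matches the policy on every dimension.

    Assumption: one shared value per dimension is enough to match it.
    """
    return all(set(policy[d]) & set(user_attrs.get(d, ())) for d in DIMENSIONS)

# Policy attached to the salary review vector in the example above.
salary_policy = {
    "roles": ["hr_manager", "admin"],
    "groups": ["team_hr"],
    "departments": ["human_resources"],
}

hr_manager = {"roles": ["hr_manager"], "groups": ["team_hr"], "departments": ["human_resources"]}
analyst = {"roles": ["analyst"], "groups": ["team_finance"], "departments": ["finance"]}
# Right role, wrong group and department: still refused.
misplaced_manager = {"roles": ["hr_manager"], "groups": ["team_finance"], "departments": ["finance"]}
```

Under this model `satisfies(salary_policy, hr_manager)` holds, while both `analyst` and `misplaced_manager` fail, because a match on one dimension does not compensate for a mismatch on another.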

Section 05 · ABAC

ABAC: access control for systems that cannot plan ahead

RBAC works well when access patterns are predictable. You know who the HR managers are, what documents they need, and the roles change slowly. But modern AI systems are dynamic. Agents run tasks on behalf of multiple users. Pipelines combine data from different departments. New use cases appear faster than role definitions can be updated.

Attribute-Based Access Control (ABAC) evaluates each access request against a set of attributes at decision time. Instead of asking "is this user an HR manager?", ABAC asks "does this request satisfy all required conditions?" The conditions can include time of day, network source, data classification level, request context, and user attributes simultaneously.

ABAC policy example: context-aware retrieval access

Condition set

request.source = internal_vpn
user.clearance >= document.classification
time.hour between 07:00 and 20:00
service.name in approved_services

All conditions met

Access granted. Documents returned matching the user's clearance level.

Any condition fails

Access denied. Request logged. Alert triggered if denial pattern is unusual.
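The condition set above can be sketched as a small evaluator that returns both the decision and the list of failed conditions, so denials can be logged with a reason. This is a minimal illustration: the `Request` fields, the integer clearance scale, and the service names in `APPROVED_SERVICES` are hypothetical.

```python
from dataclasses import dataclass
from datetime import time

APPROVED_SERVICES = {"retrieval-service", "hr-assistant"}  # hypothetical names

@dataclass(frozen=True)
class Request:
    source: str
    user_clearance: int           # higher means more access (assumed scale)
    document_classification: int
    local_time: time
    service_name: str

def evaluate(req):
    """Return (granted, failed_conditions). Every condition must hold."""
    checks = {
        "source_is_internal_vpn": req.source == "internal_vpn",
        "clearance_sufficient": req.user_clearance >= req.document_classification,
        "within_hours": time(7, 0) <= req.local_time <= time(20, 0),
        "service_approved": req.service_name in APPROVED_SERVICES,
    }
    failed = [name for name, ok in checks.items() if not ok]
    return (not failed, failed)

daytime = Request("internal_vpn", 3, 2, time(10, 30), "retrieval-service")
late_external = Request("public_internet", 3, 2, time(22, 30), "retrieval-service")
```

Returning the failed condition names rather than a bare boolean makes it possible to spot unusual denial patterns, such as one identity repeatedly failing only the network-source check.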

ABAC adds complexity but is more appropriate for AI agent systems where the requesting identity is a service account acting on behalf of a human user. The agent's request attributes include both the agent's own identity and the delegated user's attributes. This lets you write policies like: "allow if the agent is an authorised service AND the user it is acting for has appropriate clearance AND the request comes from an approved execution environment."

VectaX's multi-dimensional policy system supports a hybrid of RBAC and ABAC. Roles, groups, and departments are RBAC dimensions. The access policy can be extended with additional attributes at policy definition time. AgentIQ from Mirror Security adds runtime policy enforcement that evaluates request context, which complements VectaX's per-vector access control with system-level ABAC.

Section 06 · Common Failure Mode

Metadata filtering failures: the access control that looks solid but is not

The most common access control pattern in RAG systems is to add a metadata filter to every query. The application knows the user's tenant ID, department, or clearance level, and adds a filter={"tenant_id": "acme_corp"} parameter to the similarity search query. This is easy to implement and works correctly in normal operation.

It breaks in at least four ways.

Four ways metadata filtering fails as sole access control

Failure 1: Application bug drops the filter

An exception handler, a code path for admin preview, or a race condition causes the filter parameter to be omitted from the query. The vector database returns results from all tenants. Without database-layer enforcement, there is no safety net.

Failure 2: Prompt injection removes the filter

A user submits a query with embedded instructions that manipulate the application into constructing a query without the tenant filter. This is a variant of prompt injection that targets the access control logic rather than the LLM output.

Failure 3: Metadata tampering at ingestion

If metadata fields are not validated at ingestion time, an attacker who can insert a document can set its tenant ID to a different tenant's value. The document then appears in that tenant's results even though it was not authorised to be there.

Failure 4: Filter bypass via API direct access

An attacker with direct API access to the vector database (via a leaked key or public endpoint) sends a query without the filter. The application that normally adds the filter is bypassed entirely. The database has no way to know the filter should have been present.

The fix: Metadata filtering is useful as a first-pass efficiency mechanism (reducing the search space) but must not be the only access control. Combine it with collection-level or index-level isolation (so direct API access hits a boundary) and VectaX cryptographic enforcement (so even a correct query returns undecryptable ciphertext to an unauthorised user).
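Two of the failures above (a dropped filter and tampered metadata at ingestion) can be narrowed at the application layer by deriving the tenant from the authenticated identity and failing closed. The sketch below is illustrative only: the `identity` dictionary shape, the `tenant_<id>` collection naming, and both function names are assumptions, and this does not replace database-level isolation or cryptographic enforcement.

```python
class AccessError(Exception):
    """Raised when a request cannot be safely scoped to a tenant."""

def build_query(identity, user_filter=None):
    """Build search parameters with the tenant taken from the identity.

    The caller-supplied filter can narrow results but never choose the tenant.
    """
    tenant_id = identity.get("tenant_id")
    if not tenant_id:
        # Fail closed: no tenant means no query, not an unfiltered query.
        raise AccessError("no tenant bound to identity; refusing to query")
    merged = dict(user_filter or {})
    merged["tenant_id"] = tenant_id  # identity-derived value always wins
    return {"collection": f"tenant_{tenant_id}", "filter": merged}

def validate_ingest_metadata(identity, metadata):
    """Reject documents whose claimed tenant differs from the writer's tenant."""
    if metadata.get("tenant_id") != identity.get("tenant_id"):
        raise AccessError("metadata tenant_id does not match writer identity")
    return metadata

query = build_query(
    {"tenant_id": "acme_corp"},
    {"tenant_id": "other_corp", "topic": "pricing"},  # attempted override
)
```

Here the attempted override is discarded and `query["filter"]["tenant_id"]` is `"acme_corp"`. Routing to a per-tenant collection as well as filtering means a direct API call with a leaked tenant key still hits a database-enforced boundary.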

Section 07 · Identity

Identity management: every request needs a traceable identity

Strong access control requires that every request to the vector database can be tied to a known, authenticated identity. Without this, access decisions are anonymous, audit logs are useless, and post-incident forensics cannot determine who accessed what or when.

In AI systems, the identities making requests are not just human users. There are four identity types, each with different security requirements.

👤

Human users

Developers, analysts, and administrators accessing the vector database through applications or direct API calls. They should authenticate via SSO with MFA, and their VectaX user secret keys should be scoped to their role and department.

Well-understood. Standard IAM patterns apply.

⚙

Backend services

The ingestion pipeline, the retrieval service, and the LLM orchestration layer. Each should have its own service identity with scoped credentials. Never share credentials between services.

Risk: over-permissioned service accounts that accumulate access over time.

🕐

Scheduled jobs

Re-indexing jobs, data cleanup tasks, and batch processing pipelines. These run without human oversight and often have broad permissions set during initial setup that are never reviewed. Audit regularly.

Risk: stale permissions that outlive the original purpose of the job.

🤖

Autonomous AI agents

The highest-risk identity type. Agents make decisions and take actions without direct human supervision. They can be compromised through prompt injection, causing them to act outside their intended scope. Each agent needs its own identity, scoped strictly to what its task requires, with AgentIQ monitoring its runtime behaviour.

Risk: prompt injection + excessive permissions = highest blast radius in any RAG system.

Agent identity principle: An agent acting on behalf of user A should not have broader access than user A has. If user A can only retrieve documents from the HR namespace, the agent acting for user A should be constrained to the same namespace. This delegation scoping prevents privilege escalation through the agent layer.
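Delegation scoping can be expressed as an intersection: the agent's effective permissions for a task are whatever both the agent and the delegating user are allowed to do. A minimal sketch, assuming scopes are represented as a mapping from namespace to a set of operations (a representation chosen here for illustration):

```python
def effective_scope(agent_scope, user_scope):
    """Per-namespace intersection of agent and user permissions.

    Namespaces or operations held by only one side are dropped, so the
    agent can never exceed the user it is acting for.
    """
    result = {}
    for namespace in agent_scope.keys() & user_scope.keys():
        shared = agent_scope[namespace] & user_scope[namespace]
        if shared:
            result[namespace] = shared
    return result

# Hypothetical scopes: the agent is broadly capable, user A is HR-only.
agent_scope = {"hr": {"query"}, "finance": {"query", "insert"}}
user_a_scope = {"hr": {"query", "insert"}}

scope_for_task = effective_scope(agent_scope, user_a_scope)
```

In this example the agent acting for user A ends up with query access to the HR namespace only: it loses its finance access because user A has none, and it does not gain insert rights because the agent itself lacks them.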

Section 08 · Design Principle

Least privilege: scoping each component to exactly what it needs

Least privilege is the principle that each component in a system should have only the minimum permissions required for its specific function. In vector database deployments, this is frequently violated because teams assign broad permissions during development and never narrow them for production.

The Cisco 2024 white paper specifically identifies least privilege as one of the most effective ways to reduce risk in complex AI systems. Applied to a RAG pipeline, here is what it means in practice:

Least privilege permissions per RAG component

📥

Ingestion service

Write-only access to its designated namespace. Cannot read, query, or delete. Cannot access other namespaces.

Write only

🔍

Retrieval service

Read-only query access to authorised namespaces. Cannot insert, modify, or delete. Cannot access admin endpoints.

Read only

🔗

LLM orchestration layer

Receives retrieved results from the retrieval service. Does NOT have direct access to the vector database. Cannot run its own queries.

No direct DB access

🤖

AI agent

Scoped to the namespaces and operations its task requires. Permissions mirror the user it is acting for. Re-evaluated per task, not assigned permanently.

Task-scoped

🛠

Admin / maintenance

Full access but via MFA-protected accounts that are not used for operational tasks. Separate credentials from service accounts. All admin actions logged and reviewed.

MFA protected
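Because permissions tend to drift from the intended baseline, it can help to compare what each component has actually been granted against what it should have. The following sketch assumes you can export granted permissions as simple sets; the component keys and the `read`/`write` labels are illustrative, not any platform's permission names.

```python
# Intended least-privilege baseline for the operational components above.
BASELINE = {
    "ingestion_service": {"write"},
    "retrieval_service": {"read"},
    "llm_orchestrator": set(),  # no direct database access
}

def excess_permissions(granted):
    """Map each component to the permissions it holds beyond the baseline.

    Components missing from the baseline are treated as entitled to nothing.
    """
    report = {}
    for component, perms in granted.items():
        extra = set(perms) - BASELINE.get(component, set())
        if extra:
            report[component] = extra
    return report

# Hypothetical export of current grants.
granted = {
    "ingestion_service": {"write", "read"},
    "retrieval_service": {"read"},
    "llm_orchestrator": {"read"},
}
findings = excess_permissions(granted)
```

Here `findings` flags the extra read access on the ingestion service and the direct read access held by the orchestration layer, while the retrieval service passes.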

Section 09 · Governance

Data governance: knowing what you have and who owns it

Data governance for vector databases is the set of policies, processes, and records that answer four questions: what data is in the system, who is responsible for it, who has access to it, and what happens to it over time. Without governance, access control reviews are impossible, compliance audits become guesswork, and incident response lacks the information needed to understand blast radius.

📋

Data inventory

A record of every collection, namespace, and index in your vector database: what documents it contains, what classification level applies, which team owns it, and when it was last audited. Without an inventory you cannot answer a regulator's question about what personal data your AI system processes.

🗂

SBOM and AI-BOM

A Software Bill of Materials (SBOM) tracks library versions and dependencies. An AI Bill of Materials (AI-BOM) extends this to embedding models, fine-tuned weights, training datasets, and retrieval configuration versions. When a CVE affects a library or a research paper reveals a vulnerability in an embedding model architecture, your AI-BOM tells you immediately whether you are affected.

📝

Access review process

A scheduled review (quarterly for most systems, monthly for high-sensitivity) that checks whether all service identities, user accounts, and agent identities still need the access they have. People change roles. Services are deprecated. Agents are updated. Permissions that were appropriate six months ago may no longer be necessary.

🔄

Data lifecycle and retention

Vector databases accumulate embeddings over time. Data that should have been deleted under GDPR's right to erasure or a data retention policy may still be present in the vector index as an embedding. Implement a data lifecycle process that tracks when each document was embedded, flags documents past their retention date, and triggers re-indexing when source documents change.
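An AI-BOM is only useful if it can be queried when an advisory arrives. The sketch below shows one possible shape for an inventory record and an "am I affected?" lookup. The field names, component names, and version strings are placeholders, not a standard AI-BOM schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BomComponent:
    name: str
    kind: str         # e.g. "library", "embedding_model", "dataset", "retrieval_config"
    version: str
    environment: str  # e.g. "staging", "production"

def affected_by(inventory, name, vulnerable_versions):
    """Return the deployed components matching an advisory's name and versions."""
    return [
        c for c in inventory
        if c.name == name and c.version in vulnerable_versions
    ]

# Placeholder inventory entries for illustration only.
inventory = [
    BomComponent("example-embedding-model", "embedding_model", "2.1", "production"),
    BomComponent("example-embedding-model", "embedding_model", "2.0", "staging"),
    BomComponent("example-orchestrator", "library", "1.4.0", "production"),
]

hits = affected_by(inventory, "example-embedding-model", {"2.0"})
```

In this example an advisory against version 2.0 of the model matches only the staging deployment, which is the kind of answer an incident responder needs quickly.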

GDPR and the right to erasure: Under GDPR Article 17, individuals can request deletion of their personal data. If a document containing personal data about a user was embedded into your vector index, deleting the source document is not sufficient. You must also delete the corresponding embedding vectors and any cached responses that may contain information derived from that document. This requires knowing exactly which vectors were generated from which source documents, which means maintaining the ingestion audit trail from Module 3.
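The erasure requirement depends on a reliable mapping from source documents to the vectors derived from them. A minimal sketch of such a mapping follows; the `IngestionTrail` class is hypothetical, and `delete_vectors` and `purge_cache` stand in for whatever your vector database client and response cache actually provide.

```python
from collections import defaultdict

class IngestionTrail:
    """Tracks which vector IDs were generated from which source document."""

    def __init__(self):
        self._vectors_by_doc = defaultdict(set)

    def record(self, doc_id, vector_id):
        """Call at ingestion time for every chunk that gets embedded."""
        self._vectors_by_doc[doc_id].add(vector_id)

    def erase(self, doc_id, delete_vectors, purge_cache):
        """Delete every vector derived from doc_id and purge cached responses.

        Returns the sorted list of vector IDs that were removed.
        """
        vector_ids = sorted(self._vectors_by_doc.pop(doc_id, set()))
        if vector_ids:
            delete_vectors(vector_ids)
        purge_cache(doc_id)
        return vector_ids

# Demo with in-memory stand-ins for the database and cache calls.
deleted, purged = [], []
trail = IngestionTrail()
trail.record("doc-42", "vec-1")
trail.record("doc-42", "vec-2")
trail.record("doc-7", "vec-3")
removed = trail.erase("doc-42", deleted.extend, purged.append)
```

After the call, `removed` lists both vectors derived from `doc-42`, while the vector from `doc-7` is untouched. In production the trail itself would need durable storage, since losing it makes complete erasure impossible to prove.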

Section 10 · Operations

Rotation policies and version pinning

Long-lived credentials and unpinned library versions are two operational risks that compound over time. Both are simple to address and both are consistently neglected in fast-moving AI projects.

Credential / asset type	Recommended rotation frequency	Trigger for immediate rotation	Automated?
Vector DB API keys (production)	Every 30 to 60 days	Key exposure, team member departure, breach	Yes, where platform supports it
VectaX user secret keys	On role change or quarterly	User role changes, compromise suspected	Via re-issuance on access review
Encryption master keys (VectaX)	Annually with re-encryption	Key compromise confirmed	No, requires re-encryption workflow
Service account credentials	Every 90 days	Service decommissioned, compromise suspected	Yes via secrets manager
Embedding model API keys (OpenAI, Cohere)	Every 60 days	Key leaked, provider advisory	Via secrets manager rotation
Embedding model weights (pinned version)	Review on provider release, update after testing	Security advisory, backdoor discovered	No, requires re-embedding after version change
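A rotation schedule like the one in the table can be checked automatically. The sketch below takes the upper bound of each interval from the table, approximates "annually" as 365 days, and uses an illustrative seven-day warning window; the credential type keys are made up for this example.

```python
from datetime import date, timedelta

# Maximum age in days per credential type, taken from the table above.
ROTATION_DAYS = {
    "vector_db_api_key": 60,
    "service_account": 90,
    "embedding_api_key": 60,
    "encryption_master_key": 365,  # "annually", approximated
}

def rotation_status(kind, last_rotated, today, warn_days=7):
    """Classify a credential as 'ok', 'due_soon', or 'overdue'."""
    due = last_rotated + timedelta(days=ROTATION_DAYS[kind])
    if today > due:
        return "overdue"
    if today >= due - timedelta(days=warn_days):
        return "due_soon"
    return "ok"

today = date(2026, 3, 1)
api_key_status = rotation_status("vector_db_api_key", date(2025, 12, 1), today)
service_status = rotation_status("service_account", date(2025, 12, 5), today)
master_key_status = rotation_status("encryption_master_key", date(2025, 10, 1), today)
```

Passing `today` explicitly rather than calling `date.today()` inside the function keeps the check deterministic and easy to test.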

Version pinning for libraries: Pin the versions of LangChain, LlamaIndex, your vector database client, and the embedding model client in your requirements file. Unreviewed auto-updates have introduced breaking security regressions in orchestration libraries before. Test version updates in a staging environment against your RAG system's security controls before deploying to production. Keep a record of which library versions are running in each environment.
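A simple check in CI can catch unpinned dependencies before they reach production. The sketch below flags any requirement line that is not an exact `==` pin. It is deliberately simplified: lines with environment markers or URL-based requirements would also be flagged, and the package names and versions in the example are placeholders.

```python
import re

# A name (with optional extras) followed by an exact '==' version pin.
PINNED = re.compile(r"^[A-Za-z0-9_.\-\[\],]+==[A-Za-z0-9_.!+\-]+$")

def unpinned_requirements(text):
    """Return requirement lines that are not pinned to an exact version."""
    problems = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):
            continue  # skip blank lines and pip options such as -r or --index-url
        if not PINNED.match(line.replace(" ", "")):
            problems.append(line)
    return problems

requirements = """\
# orchestration
example-orchestrator==1.2.3
example-index>=0.1
example-vector-client
example-embedder[extras] == 4.5.6  # pinned with extras
"""
flagged = unpinned_requirements(requirements)
```

In this example the range specifier and the bare package name are flagged, while both exact pins pass.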

Section 11 · Visibility

Access monitoring: seeing what is actually happening

Access control without monitoring is security theatre. You have defined who should have access, but you have no way to verify that the rules are working, detect when they are being probed, or respond when they fail. Every production vector database deployment needs a monitoring layer that turns raw access events into actionable visibility.

Query volume per identity

Baseline each service identity's normal query rate. Sudden spikes may indicate adversarial probing or a pipeline malfunction. Plot per-identity volume over rolling 24-hour windows.

Alert on 3x normal rate

Failed authentication attempts

Repeated failures from the same source may indicate credential stuffing or key brute-force attempts. Log all authentication failures with source IP and identity.

Alert on 5+ failures per hour

Zero-result queries

A high volume of queries returning zero results can indicate adversarial probing of namespace boundaries or access controls. Legitimate users rarely make queries that return nothing repeatedly.

Investigate clusters of zero-result queries

Cross-namespace queries

Queries from identities that do not normally access a particular namespace. A finance service account querying the HR namespace should trigger immediate investigation.

Alert on first occurrence

Off-hours activity

Query or write activity outside normal business hours for identities that should only be active during those hours. Automated pipelines that run 24/7 should be baselined separately.

Alert on human accounts only

Credential rotation status

Track which keys are approaching their rotation date and which have exceeded it. Alert before expiry so rotation happens proactively rather than in response to a breach.

Dashboard metric
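Three of the thresholds above translate directly into alert rules. This is a sketch under simple assumptions: rates and baselines are already computed elsewhere, timestamps are plain seconds, and the identity and namespace names are hypothetical. The default thresholds (3x rate, 5 failures per hour) come from the cards above.

```python
def volume_alert(current_rate, baseline_rate, factor=3.0):
    """Alert when an identity's query rate reaches `factor` times its baseline."""
    return baseline_rate > 0 and current_rate >= factor * baseline_rate

def auth_failure_alert(failure_times, now, window_s=3600, threshold=5):
    """Alert when `threshold` or more failures fall inside the trailing window."""
    recent = [t for t in failure_times if 0 <= now - t < window_s]
    return len(recent) >= threshold

def cross_namespace_alert(identity, namespace, baseline):
    """Alert when an identity touches a namespace outside its known set."""
    return namespace not in baseline.get(identity, set())

# Hypothetical baseline of which identity normally uses which namespaces.
namespace_baseline = {"finance-svc": {"finance"}, "hr-svc": {"hr"}}

spike = volume_alert(current_rate=300, baseline_rate=100)
stuffing = auth_failure_alert([0, 600, 1200, 1800, 2400], now=3000)
crossing = cross_namespace_alert("finance-svc", "hr", namespace_baseline)
```

Identities with no baseline at all are treated as crossing on every namespace, which errs on the side of alerting for new or unregistered service accounts.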

Where AgentIQ fits in monitoring: AgentIQ from Mirror Security provides runtime monitoring specifically for AI agent behaviour, including tool calls, retrieved content handling, and output classification. For agentic RAG systems where an agent has retrieve-and-act capabilities, monitoring the agent's behaviour at runtime catches misuse that vector database query logs alone would not surface.
