Q: What is noise in homomorphic encryption and why does it matter?

Noise is a small random value added to ciphertexts during encryption in lattice-based homomorphic schemes. It is required for security: without noise, the scheme would be breakable. Each arithmetic operation on a ciphertext increases the noise level. Addition increases noise slightly. Multiplication increases noise significantly. When noise grows beyond a threshold, the ciphertext becomes undecryptable. SHE schemes are limited by how many operations can be performed before this threshold is reached. FHE introduces bootstrapping to reduce noise and reset the operational budget.

FHE Deep Dive: PHE, SHE, FHE, CKKS, Noise and Bootstrapping | Track 3D

Q: What is the difference between encryption at rest, in transit, and in use?

Encryption at rest protects data stored on disk or in a database using symmetric ciphers like AES-256. Encryption in transit protects data moving between services using TLS 1.3. Both leave a gap: the processing layer. When data is being computed on, it must be decrypted, exposing it in memory. Encryption in use, implemented through homomorphic encryption, closes this gap by allowing computation directly on ciphertext without decryption.

Q: What is searchable encryption and how does it apply to vector databases?

Searchable encryption allows a database to process queries over encrypted data without decrypting the data. The query is submitted as ciphertext, the database searches using a secure index, and returns encrypted results. The server never sees the plaintext data or query. MongoDB Queryable Encryption is a production implementation. In vector databases, searchable encryption enables similarity search over encrypted embeddings, which is how VectaX allows RAG retrieval without exposing plaintext vectors.

Q: What is the difference between PHE, SHE, and FHE?

Partially homomorphic encryption (PHE) supports only one operation type: either addition or multiplication, but not both. Textbook (unpadded) RSA is multiplicatively homomorphic; Paillier is additively homomorphic. Somewhat homomorphic encryption (SHE) supports both addition and multiplication, but only up to a bounded number of operations before noise accumulation makes the ciphertext undecryptable. Fully homomorphic encryption (FHE) supports unlimited addition and multiplication, enabling any computable function to be evaluated on encrypted data. FHE uses bootstrapping to reset accumulated noise and continue operations.

Q: What is bootstrapping in FHE?

Bootstrapping is the technique introduced by Craig Gentry in 2009 that enables fully homomorphic encryption. It reduces accumulated noise in a ciphertext by homomorphically evaluating the decryption circuit of the FHE scheme itself. The output is a fresh ciphertext encoding the same plaintext with lower noise, allowing further operations to proceed. Bootstrapping is computationally expensive: it historically took seconds to minutes per operation and, although much faster today, it remains the main performance bottleneck of practical FHE systems.

Q: What is the CKKS scheme and why is it suited for machine learning?

CKKS (Cheon-Kim-Kim-Song 2017) is a homomorphic encryption scheme designed for approximate arithmetic on real and complex numbers. Machine learning computations are inherently approximate: floating-point numbers are already approximations, and small rounding errors in model inference are acceptable. CKKS exploits this by allowing controlled approximation error in ciphertext computations, which makes it significantly more efficient than exact-arithmetic FHE schemes for ML workloads. CKKS also supports SIMD batching, encoding many values into one ciphertext and applying operations in parallel, further improving performance.

Q: How does VectaX implement encrypted similarity search?

VectaX uses Similarity-Preserving Search to encrypt vector embeddings such that the cosine similarity ordering between vectors is preserved over the encrypted values. This means nearest-neighbour search produces the same ranked results on encrypted vectors as it would on plaintext vectors. Access control is enforced at query time using user secret keys derived from role, group, and department attributes. The embedding never appears as plaintext in the vector database, so an attacker with database access cannot reconstruct document content from the stored vectors.

Section 01

Encryption foundations

Before getting into homomorphic encryption, it helps to have a clear picture of what conventional encryption types do, what they protect, and where each one sits in a typical AI system. Most AI deployments use several encryption types simultaneously, each covering a different part of the data flow. There are five: symmetric, asymmetric, encryption in transit, encryption at rest, and encryption in use. The first four are standard practice. The fifth is what this module is really about.

🔑

Symmetric encryption

AES-128 / AES-256 / ChaCha20

One key encrypts and decrypts. Fast and efficient for large volumes of data. AES-256 is the standard for data at rest in databases, file systems, and backups. The same key must be shared securely between parties.

Used for: database storage, disk encryption, object storage

🔐

Asymmetric encryption

RSA-2048 / RSA-4096 / ECC / Ed25519

A public key encrypts. A private key decrypts. Slower than symmetric but solves the key distribution problem. Used to securely exchange symmetric keys rather than to encrypt bulk data directly.

Used for: key exchange, digital signatures, TLS handshakes

👥

Encryption in transit

TLS 1.3 / mTLS / HTTPS

Protects data moving between two points: client and server, microservice and database, embedding model and vector store. Data is decrypted at each endpoint before processing. The channel is secure; the endpoints are not.

Used for: API calls, service-to-service communication, user sessions

📜

Encryption at rest

AES-256-GCM / AES-256-CBC

Protects data stored on disk or in a database from physical theft or storage-level breach. Does not protect data during processing: it must be decrypted before the application can use it. Standard in cloud deployments.

Used for: S3 buckets, database volumes, model weight storage

🔒

Encryption in use

FHE (CKKS, BFV, BGV) / Searchable Encryption / SMPC

The missing piece. Protects data while it is being computed on, not just stored or moved. The server processes ciphertext directly. No decryption ever occurs on the server. This is what homomorphic encryption and searchable encryption deliver, and it is the core of what VectaX implements in a RAG pipeline.

PHE: one operation type · SHE: bounded depth · FHE: unlimited computation · VectaX uses this

Encryption at rest and in transit are both standard practice and together they cover the storage and transport layers. But neither protects data during processing. Encryption in use closes this gap. Industry research into vector database deployments has found that many vector stores ship without native encryption support. Start by confirming your vector store covers at-rest and in-transit. Then add encryption in use for the processing layer.

Section 02

The processing gap and encryption in use

Encryption at rest protects data on disk. Encryption in transit protects data on the wire. Both share one weakness: data must be decrypted before it can be processed. The embedding model reads plaintext documents. The vector database stores plaintext embeddings. The LLM receives plaintext context. The processing layer is exposed.

This is where encryption in use comes in. The term covers any technique that allows computation to happen on encrypted data without decrypting it first. The server processes ciphertext. The result, when decrypted by the client, matches what you would get from computing on the plaintext. The server learns nothing about the underlying data.

This gap matters more for AI systems than for traditional applications because of the inference attacks covered in D1. An attacker who can access the vector store at the processing layer can extract embedding vectors that partially reconstruct document content. Encryption in use closes this gap at the layer where the attack happens.

Encryption capability spectrum — encryption in use covers rungs 3 to 6

1

Encryption at rest Standard

Protects stored data from physical access or storage breach. Data decrypted before any processing occurs.

Protected: storage layer only

2

Encryption in transit Standard

Protects data moving between services. Decrypted at every endpoint before processing.

Protected: network layer only

IN USE

Searchable / Queryable encryption Encryption in use

Search and query over encrypted data. Server never sees plaintext. Production-ready for specific query types including similarity search.

Protected: search and retrieval layer

IN USE

Partially homomorphic (PHE) Encryption in use

One arithmetic operation type (add or multiply) on encrypted data. Limited but efficient for narrow use cases.

Protected: single operation type

IN USE

Somewhat homomorphic (SHE) Encryption in use

Both addition and multiplication on encrypted data up to a bounded depth. Limited by noise accumulation.

Protected: bounded computation depth

IN USE

Fully homomorphic (FHE) Full encryption in use

Unlimited arithmetic on encrypted data. Any computable function. Bootstrapping resets noise. The most complete form of encryption in use available today. Used in VectaX.

Protected: full computation, unlimited depth

Section 03

Searchable encryption

Searchable encryption (SE), also called queryable encryption, solves a specific problem: how to let a database server process search queries without ever seeing the plaintext data or query. The server operates entirely on ciphertext. Only the client who holds the keys can read the results.

This is the technique that bridges conventional encryption and full homomorphic encryption. It does not support arbitrary computation, but it supports the specific operations that vector databases need most: equality checks, range queries, and similarity search.

Client (User)

Query: plaintext → Encrypt query using local key → Encrypted query (ciphertext)

Key stays with client. Never sent to server.

↓ Encrypted query sent

Database Server

Receives ciphertext query → Searches secure index on ciphertext → Returns encrypted matching results

Server never decrypts. Never sees plaintext.

↓ Encrypted results

Client (User)

Receives encrypted results → Decrypts using local key → Reads plaintext results

Decryption happens client-side only.
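The flow above can be sketched with a very small keyword index. This is an illustration only, using a keyed HMAC as the search token; it is not MongoDB's construction or VectaX's, and it supports equality search only. The key, the document IDs, and the function names are all hypothetical, and encryption of the record payloads themselves (for example with AES-GCM) is left out to keep the sketch short.

```python
import hashlib
import hmac

def search_token(key: bytes, keyword: str) -> str:
    """Client side: derive an opaque token from a keyword using the secret key."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).hexdigest()

def build_index(key: bytes, documents: dict) -> dict:
    """Client side: map each keyword token to the record IDs that contain it."""
    index = {}
    for doc_id, keywords in documents.items():
        for keyword in keywords:
            index.setdefault(search_token(key, keyword), set()).add(doc_id)
    return index

def server_search(index: dict, token: str) -> set:
    """Server side: match tokens only. The server holds no key and no keywords."""
    return index.get(token, set())

# Hypothetical key and data, for illustration.
client_key = b"client-only-secret-key"
docs = {
    "doc1": ["invoice", "acme"],
    "doc2": ["contract"],
    "doc3": ["invoice", "globex"],
}
index = build_index(client_key, docs)          # uploaded to the server
hits = server_search(index, search_token(client_key, "invoice"))
print(sorted(hits))
```

A deterministic token like this lets the server see which records share a keyword and when a query repeats. Production schemes add further machinery to limit that leakage.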

Security research into vector database deployments highlights searchable encryption as a key technique for protecting queries in RAG implementations. At every step of a RAG pipeline, from query to retrieval to response, there is an opportunity for data to be intercepted in plaintext. Searchable encryption closes this by ensuring neither the query nor the retrieved data ever appears as plaintext on the server side.

MongoDB Queryable Encryption is a production implementation. It uses a secure index built on structured encryption schemes. The encryption keys are held by the application, not MongoDB. AWS KMS, Google Cloud KMS, Azure Key Vault, and any KMIP-compliant key management system are supported as key providers.

In vector databases, the same principle is extended to similarity search: the query vector is encrypted, the stored embeddings are encrypted, and the similarity search operates on ciphertext. This is what VectaX implements using Similarity-Preserving Search, covered in Section 13.

Section 04

The homomorphic encryption spectrum

Homomorphic encryption (HE) is a class of encryption schemes that allows mathematical operations to be performed directly on ciphertext. The result of the operation, when decrypted, matches the result of performing the same operation on the original plaintext. The server computes but never reads the data.

HE comes in three levels of power, trading off capability against computational cost.

Property	PHE (Partially Homomorphic)	SHE (Somewhat Homomorphic)	FHE (Fully Homomorphic)
Addition on ciphertext	Paillier: Yes	Yes	Yes
Multiplication on ciphertext	RSA: Yes	Yes	Yes
Both add + multiply	No	Yes	Yes
Unlimited operations	No	No (depth limited)	Yes (bootstrapping)
Noise management	Not needed	Accumulates	Bootstrapping
Performance	Fast	Moderate	Slow (improving)
ML workloads	Very limited	Shallow networks	Full support

Section 05

Partially homomorphic encryption (PHE)

Partially homomorphic encryption supports exactly one arithmetic operation on encrypted data. You can either add ciphertexts or multiply them, but not both in the same scheme. This sounds limiting, and it is, but single-operation homomorphism is enough for a surprisingly wide range of practical tasks.

Textbook (unpadded) RSA is multiplicatively homomorphic. If you encrypt two numbers m1 and m2 under the same RSA public key, multiplying the two ciphertexts together modulo n produces a ciphertext that decrypts to m1 times m2 (modulo n). This property was not intentionally designed into RSA; it falls out of the modular exponentiation structure, and the randomised padding used in deployed RSA encryption (such as OAEP) deliberately destroys it. It enables private set intersection, some types of digital voting, and specific machine learning prediction tasks.

The Paillier cryptosystem (1999) is additively homomorphic. Multiplying two Paillier ciphertexts together produces a ciphertext that decrypts to the sum of the two plaintexts. Paillier is widely used for privacy-preserving aggregation: tallying encrypted votes without decrypting individual ballots, summing encrypted financial data across parties, and computing encrypted dot products for federated model aggregation.

✖

RSA: multiplicative homomorphism

E(m1) * E(m2) = E(m1 * m2)

Multiplying two RSA ciphertexts gives the encryption of the product of the plaintexts. Useful for private set intersection and certain prediction tasks. Not additively homomorphic.

Use cases: digital signatures, private equality tests, some voting schemes
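The multiplicative property can be checked with a toy example. The sketch below uses textbook RSA with tiny primes chosen only for illustration; the parameter values and function names are assumptions, and nothing this small (or unpadded) should be used in practice.

```python
# Toy textbook RSA with tiny primes (illustration only, not secure).
p, q = 61, 53
n = p * q                     # public modulus
phi = (p - 1) * (q - 1)
e = 17                        # public exponent, coprime to phi
d = pow(e, -1, phi)           # private exponent (modular inverse, Python 3.8+)

def encrypt(m: int) -> int:
    return pow(m, e, n)

def decrypt(c: int) -> int:
    return pow(c, d, n)

m1, m2 = 7, 6
# Multiply the two ciphertexts; neither plaintext is touched.
c_product = (encrypt(m1) * encrypt(m2)) % n
print(decrypt(c_product))     # 42, the product of the plaintexts modulo n
```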

✚

Paillier: additive homomorphism

E(m1) * E(m2) = E(m1 + m2)

Multiplying two Paillier ciphertexts gives the encryption of the sum of the plaintexts. Raising a Paillier ciphertext to the power of a plaintext constant k gives the encryption of k times the plaintext (scalar multiplication).

Use cases: encrypted vote counting, privacy-preserving aggregation, federated dot products
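A minimal Paillier sketch makes both properties concrete. It assumes the common simplification g = n + 1 and uses small primes chosen for illustration; the parameter values and function names are hypothetical, and real deployments use moduli of 2048 bits or more.

```python
import math
import random

# Toy Paillier parameters (illustration only, not secure).
p, q = 293, 433
n = p * q
n_sq = n * n
g = n + 1                                            # common simplification
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)    # lcm(p - 1, q - 1)
mu = pow(lam, -1, n)                                 # valid because g = n + 1

def encrypt(m: int) -> int:
    """Randomised encryption: c = g^m * r^n mod n^2."""
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c: int) -> int:
    u = pow(c, lam, n_sq)
    return ((u - 1) // n * mu) % n

# Additive homomorphism: multiply ciphertexts to add plaintexts.
c_sum = (encrypt(15) * encrypt(27)) % n_sq
print(decrypt(c_sum))            # 42

# Scalar multiplication: raise a ciphertext to a plaintext constant.
c_scaled = pow(encrypt(7), 6, n_sq)
print(decrypt(c_scaled))         # 42
```

Because encryption is randomised, two encryptions of the same value differ, yet both decrypt correctly and both take part in the homomorphic operations.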

PHE limitation for machine learning. Neural networks require both addition and multiplication on the same values: weighted sums (products that are then added) followed by activation functions (evaluated as polynomial approximations, which need further multiplications). PHE cannot compute both in the same encrypted domain. This is why SHE and FHE are needed for general ML workloads.

Section 06

Somewhat homomorphic encryption (SHE)

Somewhat homomorphic encryption supports both addition and multiplication on ciphertext, which makes it capable of evaluating any polynomial function of the encrypted data. The catch is that it can only do so up to a bounded number of operations before the ciphertext becomes unreadable.

The bound exists because of noise. SHE schemes add a small random noise value to every ciphertext during encryption. This noise is required for security: without it, the scheme would be algebraically breakable. Each arithmetic operation performed on a ciphertext changes its noise level. Addition increases noise by a small amount. Multiplication increases noise significantly. When the noise grows beyond a threshold, the decryption algorithm can no longer correctly recover the plaintext.

The maximum number of sequential multiplications that a SHE scheme can support before noise becomes fatal is called the multiplicative depth. A scheme with multiplicative depth 10 can evaluate polynomials of degree up to 2^10 = 1,024, but not deeper circuits. This is enough for inference with shallow neural networks but not for training deep networks or running large language model inference.
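The relationship between depth and polynomial degree is simple arithmetic: each level of multiplication can at most double the degree. A small sketch, with hypothetical helper names:

```python
def max_degree(depth: int) -> int:
    """Largest polynomial degree reachable with `depth` sequential multiplications."""
    return 2 ** depth

def min_depth(degree: int) -> int:
    """Fewest sequential multiplications needed to reach x**degree.

    Equal to ceil(log2(degree)), computed exactly with integer arithmetic.
    """
    if degree < 1:
        raise ValueError("degree must be at least 1")
    return (degree - 1).bit_length()

print(max_degree(10))    # 1024
print(min_depth(1024))   # 10
print(min_depth(1025))   # 11, one level more than a depth-10 scheme offers
```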

Why multiplicative depth matters for AI. A typical transformer layer involves matrix multiplication, layer normalisation, and activation functions. Each of these consumes at least one level of multiplicative depth in the encrypted domain, and the polynomial approximations used for the non-linear steps consume several. A 12-layer transformer model requires hundreds of sequential multiplications. SHE schemes hit their noise threshold long before completing inference on even a small transformer, which is why FHE with bootstrapping is required for LLM-scale encrypted inference.

Section 07

Noise accumulation

Noise is the central challenge in homomorphic encryption. Every modern HE scheme based on lattice cryptography adds noise to ciphertexts during encryption. The noise is part of the security design: it makes the ciphertext indistinguishable from random to anyone without the key. But it also accumulates as operations are performed, and it eventually breaks decryption.

Think of noise as a budget. You start with a fresh ciphertext and a full noise budget. Each addition spends a little of the budget. Each multiplication spends a lot. When the budget runs out, the ciphertext is corrupted: decryption will produce garbage.

How noise grows through operations (relative scale; decryption fails once noise crosses the threshold)

Stage	Noise level	What happens
Fresh ciphertext (after encryption)	Low noise	Small random noise added during encryption. Decryption works easily.
After 5 additions	Slight growth	Addition adds noise linearly. After 5 additions, noise is still well within bounds.
After 1 multiplication	Moderate growth	Multiplication roughly squares the noise. One multiplication uses more budget than dozens of additions.
After 3 multiplications	High noise	Still decryptable but approaching the threshold. Few more operations possible.
After 5+ multiplications (SHE limit)	THRESHOLD EXCEEDED	Decryption fails. Ciphertext is corrupted. FHE bootstrapping is needed to reset the budget.

The noise growth rate depends on the encryption parameters. Larger parameters give more noise budget but produce larger ciphertexts and slower operations. Tuning these parameters is a key engineering task when deploying HE for ML: you need enough budget to complete the required computation without making ciphertexts so large that performance becomes impractical.
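Noise growth can be observed directly in a toy scheme. The sketch below uses a simplified symmetric scheme over the integers in the style of van Dijk, Gentry, Halevi and Vaikuntanathan, not a lattice scheme, because it fits in a few lines. The key size, the fixed noise value, and the function names are assumptions made so that the numbers are reproducible; none of this is secure.

```python
import random

P = 10001  # secret key: an odd integer (toy size; real keys are enormous)

def encrypt(bit: int, r: int = 5) -> int:
    """Encrypt one bit as P*q + 2*r + bit. The term 2*r + bit is the noise.

    r is fixed here for reproducibility; a real scheme samples it at random.
    """
    q = random.getrandbits(32) + 1
    return P * q + 2 * r + bit

def noise(c: int) -> int:
    """Centred remainder of the ciphertext modulo the secret key."""
    x = c % P
    return x - P if x > P // 2 else x

def decrypt(c: int) -> int:
    """Correct only while the true noise magnitude stays below P / 2."""
    return noise(c) % 2

def product_of_ones(count: int) -> int:
    """Multiply `count` fresh encryptions of 1 together."""
    c = encrypt(1)
    for _ in range(count - 1):
        c *= encrypt(1)
    return c

print(noise(encrypt(1)))                      # 11: fresh noise
added = sum(encrypt(1) for _ in range(5))
print(noise(added), decrypt(added))           # 55 1: addition adds the noise terms
print(noise(product_of_ones(3)))              # 1331: multiplication multiplies them
print(decrypt(product_of_ones(4)))            # 0: noise 11**4 passed P / 2, wrong result
```

Three factors still decrypt to 1. With four factors the noise reaches 14,641, which is past half the key, so the wrap-around modulo P flips the recovered bit.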

Section 08

Fully homomorphic encryption (FHE)

Fully homomorphic encryption solves the noise problem and removes the operational depth limit. With FHE, you can evaluate any computable function on encrypted data, including deep neural networks, arbitrary SQL queries, and general-purpose programs, without the server ever seeing the plaintext.

The breakthrough came in 2009 when Craig Gentry, then a PhD student at Stanford, published the first FHE construction. His key insight: you can use the homomorphic properties of the scheme to homomorphically compute the decryption function itself, on an encrypted ciphertext. The result is a fresh ciphertext with lower noise that encodes the same plaintext. He called this bootstrapping.

Gentry's original scheme was impractically slow. The bootstrapping operation alone took so long that it was useful only as a theoretical proof. But it established that FHE was possible, and the field has been improving the performance of bootstrapping continuously since 2009. Modern FHE schemes bootstrap far faster: from milliseconds for single-gate bootstrapping in TFHE-style schemes to seconds for packed CKKS bootstrapping, depending on parameters and hardware.

FHE computation model

Step 1: Client

Encrypt data: Enc(m)

Key stays on client

→

Step 2: Server

Compute f() on Enc(m)

No decryption. Result is Enc(f(m))

→

Step 3: Client

Decrypt: Dec(Enc(f(m))) = f(m)

Correct result without server seeing data

The HE model works as follows: mathematical operations (addition, multiplication) are performed directly on the encrypted data. The result of the computation is decrypted to reveal the final output. This model protects AI systems by enabling model inference on encrypted inputs: the client encrypts their query, the server runs inference on the ciphertext, and the client decrypts the result.
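The three steps can be summarised as a single correctness condition. The notation is an assumption introduced here: Enc and Dec are encryption and decryption, Eval is the server-side homomorphic evaluation of f, and pk, sk are the public and secret keys.

```latex
\[
\mathrm{Dec}_{sk}\bigl(\mathrm{Eval}_{pk}(f,\ \mathrm{Enc}_{pk}(m))\bigr) = f(m)
\]
```

For approximate schemes such as CKKS the equality holds only up to a small, bounded error.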

2009 Craig Gentry · A Fully Homomorphic Encryption Scheme (PhD thesis, Stanford University)

Section 09

Bootstrapping

Bootstrapping is the technique that converts SHE into FHE. It resets accumulated noise in a ciphertext so that further operations can be performed. The concept is counterintuitive: to reduce noise in a ciphertext, you homomorphically evaluate the decryption circuit of the scheme itself, on the noisy ciphertext, using an encrypted copy of the secret key.

The output is a fresh ciphertext encoding the same plaintext, with noise reduced to a fixed low level set by the bootstrapping procedure itself rather than by the history of earlier operations. This allows the computation to continue indefinitely, as long as you bootstrap before each noise threshold is reached.

1

Ciphertext reaches high noise level

Several multiplications have used up most of the noise budget. More operations would corrupt the ciphertext.

High noise

2

Re-encrypt the ciphertext under fresh parameters

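The noisy ciphertext is encrypted again, this time under a fresh set of parameters, which wraps it in an outer layer of low-noise encryption. The secret key is also supplied in encrypted form, encrypted under the scheme's own public key. This encrypted key is usually called the bootstrapping key, and publishing it relies on a circular-security assumption.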

Still high noise

3

Homomorphically evaluate the decryption circuit

The decryption function is computed on the doubly-encrypted ciphertext. This is itself a computation performed homomorphically on fresh ciphertexts, so it starts with low noise.

Computing

4

Output: fresh ciphertext, same plaintext

The result is a ciphertext that encodes the same plaintext as the original but has noise at the level introduced by step 3, not the accumulated level from before. Noise budget is restored.

Low noise

5

Continue computation

The fresh ciphertext can be used in further additions and multiplications. Bootstrapping can be called again whenever needed, enabling unlimited sequential computation.

Full budget
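The scheduling consequence of the steps above can be sketched with a simple accounting model. It assumes each multiplication uses exactly one level and each bootstrap restores a fixed number of levels; real schemes are messier (bootstrapping itself consumes levels), and the function and parameter names are hypothetical.

```python
def count_bootstraps(total_depth: int, levels_per_refresh: int) -> int:
    """Count the bootstraps needed to run `total_depth` sequential multiplications.

    A bootstrap is triggered only when the budget is empty and work remains.
    """
    if levels_per_refresh < 1:
        raise ValueError("levels_per_refresh must be at least 1")
    remaining = levels_per_refresh
    bootstraps = 0
    for _ in range(total_depth):
        if remaining == 0:
            bootstraps += 1                  # reset the noise budget
            remaining = levels_per_refresh
        remaining -= 1                       # one multiplication uses one level
    return bootstraps

print(count_bootstraps(10, 10))   # 0: fits within one budget, no bootstrap needed
print(count_bootstraps(25, 10))   # 2: refresh before multiplications 11 and 21
```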

Bootstrapping has historically been the main performance bottleneck of FHE. Early schemes took minutes per operation. This was the primary reason FHE stayed in research labs rather than moving to production. The field has invested heavily in algorithmic improvements (CKKS GEMV kernels, fused softmax), hardware acceleration (A100/L40S GPU optimisation, Triton inference), and operator fusion. Section 12 covers how Mirror Security has addressed this with real benchmark numbers from production deployments.

Section 10

CKKS for machine learning

CKKS (Cheon-Kim-Kim-Song, 2017) is the FHE scheme most commonly used for machine learning. It was designed specifically for approximate arithmetic on real and complex numbers, which makes it a natural fit for floating-point ML computations.

The key insight behind CKKS: floating-point numbers are already approximations. When you store 0.1 as a 32-bit float, you are storing an approximation with rounding error. If your computation introduces a small additional approximation error during HE operations, the result is still useful for ML inference where small errors are acceptable. CKKS formalises this by treating the noise in ciphertexts as part of the approximation rather than as an error to be eliminated. This makes operations in CKKS significantly more efficient than in exact-arithmetic FHE schemes.

CKKS also supports SIMD batching (Single Instruction, Multiple Data). A single CKKS ciphertext can encode a vector of values, and arithmetic operations on that ciphertext apply to all values in the vector simultaneously. This is the same principle as GPU parallelism applied to encrypted computation.

CKKS SIMD batching: one ciphertext, many values, parallel operations

Vector A (plain): [1.2, 3.7, 0.5, 2.1, 4.8, 1.0]

Encrypt A: Enc([1.2, 3.7, 0.5, 2.1, 4.8, 1.0]) → one ciphertext

Vector B (enc): Enc([2.0, 1.5, 3.0, 0.8, 1.2, 2.5]) → one ciphertext

↓ Multiply ciphertexts (one operation, all slots computed in parallel)

Result (enc): Enc([2.4, 5.55, 1.5, 1.68, 5.76, 2.5]) → all products, one operation

Without SIMD batching, each multiplication would require a separate ciphertext operation. Batching reduces ciphertext operations by a factor equal to the slot count, which is typically in the thousands for production CKKS parameters.
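The slot-wise behaviour can be mimicked in plaintext to check the numbers in the figure. The `Packed` class below is a hypothetical stand-in for a batched ciphertext and performs no encryption; it only shows that one operation on the packed object touches every slot.

```python
from dataclasses import dataclass

@dataclass
class Packed:
    """Plaintext stand-in for one batched ciphertext (no encryption involved)."""
    slots: list

def multiply(x: Packed, y: Packed) -> Packed:
    """One 'ciphertext' operation that acts on every slot at once."""
    return Packed([a * b for a, b in zip(x.slots, y.slots)])

vec_a = Packed([1.2, 3.7, 0.5, 2.1, 4.8, 1.0])
vec_b = Packed([2.0, 1.5, 3.0, 0.8, 1.2, 2.5])

result = multiply(vec_a, vec_b)          # a single batched operation
batched_ops = 1
unbatched_ops = len(vec_a.slots)         # one operation per value without batching

print([round(v, 2) for v in result.slots])
print(batched_ops, unbatched_ops)
```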

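CKKS also manages noise growth through a technique called rescaling. Real values are encoded by multiplying them by a scale factor Δ and rounding to integers. When two ciphertexts are multiplied, their scales multiply too, so the result carries a scale of Δ² (the bit-length of the scale doubles). Rescaling divides the ciphertext by Δ, bringing the scale back to roughly Δ, keeping the magnitude of ciphertext values manageable, and consuming one level of the ciphertext modulus. This is more efficient than bootstrapping for managing noise within a computation of bounded depth.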
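The effect of rescaling can be seen with ordinary fixed-point integers. This sketch works on plain numbers, not ciphertexts, and the scale value and helper names are assumptions. In real CKKS the division is applied to ciphertext polynomials and drops one modulus from the modulus chain.

```python
SCALE = 2 ** 20   # the scale factor, chosen here for illustration

def encode(x: float, scale: int = SCALE) -> int:
    """Represent a real number as a scaled integer."""
    return round(x * scale)

def decode(v: int, scale: int) -> float:
    return v / scale

def rescale(v: int, scale: int = SCALE) -> int:
    """Divide by the scale with rounding, using integer arithmetic only."""
    return (v + scale // 2) // scale

a = encode(1.2)
b = encode(3.7)

product = a * b                 # the product now carries scale SCALE ** 2
rescaled = rescale(product)     # back to scale SCALE

print(decode(product, SCALE ** 2))   # close to 4.44, at the squared scale
print(decode(rescaled, SCALE))       # close to 4.44, at the original scale
```

Without the rescale step, a second multiplication would carry a scale of SCALE ** 3, and the integers would keep growing with every level.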

For machine learning inference, the typical workflow is: encode input features as a CKKS vector, apply the model's linear layers as batched ciphertext multiplications, apply activation function approximations (polynomial approximations of ReLU or sigmoid), and repeat per layer. Bootstrapping is called between layers when needed to reset noise.
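Activation functions have to be expressed with additions and multiplications only. As a minimal sketch, the degree-3 Taylor expansion of the sigmoid around zero is used below; this is an assumption chosen for simplicity, and production systems typically fit polynomials over a chosen input interval instead, since the Taylor form degrades quickly away from zero.

```python
import math

def sigmoid(x: float) -> float:
    """Reference implementation; exp() is not available homomorphically."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_poly(x: float) -> float:
    """Degree-3 Taylor approximation: 1/2 + x/4 - x^3/48.

    Uses only additions and multiplications, with multiplicative depth 2.
    """
    x2 = x * x        # level 1
    x3 = x2 * x       # level 2
    return 0.5 + 0.25 * x + (-1.0 / 48.0) * x3

for value in (0.0, 0.5, 1.0, 3.0):
    error = abs(sigmoid(value) - sigmoid_poly(value))
    print(value, round(sigmoid_poly(value), 4), round(error, 4))
```

The error is tiny near zero and large at 3.0, which is why input ranges are normally bounded or normalised before a polynomial activation is applied.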

2017 Cheon, Kim, Kim, Song · Homomorphic Encryption for Arithmetic of Approximate Numbers (ASIACRYPT 2017)

Section 11

Open-source FHE libraries

There are four major open-source FHE libraries. Each implements one or more HE schemes and targets different use cases. Choosing the right library depends on the computation type, programming language, and performance requirements.

Library	Schemes supported	Language	Best for	Key characteristics
Microsoft SEAL	BFV, CKKS	C++ (Python bindings)	ML inference, cloud applications	Industry-grade, well-documented, widely deployed. BFV for exact integer arithmetic. CKKS for approximate real-number ML.
HElib	BGV, CKKS	C++	Research, bootstrapping-heavy workloads	IBM Research origin. First practical bootstrapping implementation. BGV scheme with efficient SIMD batching. Lower-level control.
Pyfhel	BFV, CKKS, BGV	Python	Prototyping, ML research, education	Python wrapper over Microsoft SEAL and HElib. Integrates with NumPy. Lower performance than native C++ but much faster to prototype with.
OpenFHE	BFV, CKKS, BGV, FHEW, TFHE	C++ (Python bindings)	Production ML, DARPA DPRIVE project	Successor to PALISADE. Strong bootstrapping support. FHEW and TFHE schemes for boolean circuit evaluation. Hardware acceleration support.

VectaX abstracts the library layer. Rather than requiring engineers to work directly with SEAL, HElib, or OpenFHE, VectaX provides a high-level SDK where you call sdk.vectax.encrypt(vector) and sdk.vectax.decrypt(encrypted). The underlying FHE scheme and parameters are managed by VectaX for the specific use case of vector embedding encryption with similarity preservation.

Section 12

FHE shortfalls, and how VectaX addresses them

FHE has had four well-known problems that kept it out of production for over a decade. They are real. Anyone evaluating FHE for AI workloads needs to understand them, and understand what has changed.

🕒

Latency: seconds per operation

Early FHE bootstrapping operations took minutes on CPU. Even optimised implementations on GPUs took seconds per layer of a neural network. This made real-time inference impossible.

VectaX fix: CKKS GEMV and Softmax kernels optimised for A100/L40S-class GPUs with operator fusion. TTFT for a 7B model at 50-150ms. Client-side encryption at 2-5ms.

📊

Throughput: a fraction of plaintext

FHE inference ran at 1-5% of plaintext throughput in early systems. 100 tok/s plaintext became 1-5 tok/s encrypted. Commercially useless for most LLM applications.

VectaX fix: 150-240 tok/s on 7B models (vs 200-300 plaintext). 38-72 tok/s on 70B models (vs 50-90 plaintext). 225-400 tok/s on MoE (vs 300-500 plaintext). Gap is closing.

🏭

Memory quality loss

CKKS approximate arithmetic introduced quality degradation in retrieval results. Similarity scores drifted. Retrieved documents were less relevant. This made encrypted RAG impractical for production.

VectaX fix: NDCG@5 of 0.954 (vs 0.97 plaintext). 98% temporal accuracy. Retrieval p95 under 8ms. Memory quality loss under 2% in production deployments.

🧴

Integration complexity

FHE libraries required deep cryptography expertise to use. No out-of-the-box integration with standard ML frameworks, vector databases, or inference servers. Each deployment was a bespoke engineering project.

VectaX fix: SDK integration with NVIDIA NeMo, TensorRT, Triton Inference, vLLM, SGLang, FasterTransformer, and all major vector DBs. Single function call for encrypt/decrypt.

LLM inference throughput (A100 / L40S-class GPU)

Metric	Plaintext	Mirror FHE (VectaX)
7B tok/s	200-300	150-240
70B tok/s	50-90	38-72
MoE tok/s	300-500	225-400
TTFT 7B (ms)	20-50	50-150

Client encrypt: 2-5ms per request

AI memory (VectaX) quality metrics

Metric	VectaX	Plaintext	What it measures
NDCG@5	0.954	0.97	Normalised discounted cumulative gain. Retrieval relevance quality. Loss under 2% vs plaintext.
Temporal Accuracy	98%	99%	Consistency of retrieval results over time. Handles temporal drift and numeric consistency.
Retrieval p95	<8ms	4ms	95th percentile retrieval latency. Under 8ms with encryption overhead on production hardware.
BM25 Search	3-5ms	2ms	Keyword search latency over encrypted index. Production-viable at 3-5ms per query.

The headline number: memory quality loss under 2%. The NDCG@5 of 0.954 versus 0.97 plaintext means that in ranked retrieval tasks, the documents returned by VectaX encrypted search are nearly as relevant as those returned by plaintext search. For most production RAG use cases, a sub-2% quality trade-off for complete cryptographic protection of the embedding pipeline is an acceptable engineering decision. Mirror Security, March 2026.
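The arithmetic behind these figures is easy to reproduce from the numbers quoted in this section. The helper name is hypothetical, and comparing range endpoints low-to-low and high-to-high is an assumption about how the ranges pair up.

```python
def relative_loss(encrypted: float, plaintext: float) -> float:
    """Fraction of the plaintext score that is lost under encryption."""
    return (plaintext - encrypted) / plaintext

# Retrieval quality: NDCG@5 of 0.954 encrypted versus 0.97 plaintext.
ndcg_loss = relative_loss(0.954, 0.97)
print(f"NDCG@5 loss: {ndcg_loss:.2%}")

# Throughput ranges in tok/s: (encrypted low, high), (plaintext low, high).
throughput = {
    "7B": ((150, 240), (200, 300)),
    "70B": ((38, 72), (50, 90)),
    "MoE": ((225, 400), (300, 500)),
}
retained = {
    model: (enc[0] / plain[0], enc[1] / plain[1])
    for model, (enc, plain) in throughput.items()
}
for model, (low, high) in retained.items():
    print(f"{model}: {low:.0%} to {high:.0%} of plaintext throughput")
```

On these figures the encrypted path retains roughly three quarters to four fifths of plaintext throughput, and the NDCG@5 loss works out to about 1.65%.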

Production integrations

NVIDIA NeMo Memory · NVIDIA NeMo Retriever · NVIDIA TensorRT · NVIDIA Triton Inference · NVIDIA Dynamo / NVCF · NVIDIA Dynamo · vLLM · SGLang · FasterTransformer · All Vector DBs

A100/L40S-class GPUs · FHE CKKS GEMV/Softmax kernels

Section 13

VectaX implementation

VectaX applies the cryptographic techniques in this module to a specific problem: making RAG pipelines private. More precisely, VectaX delivers encryption in use for vector databases. The core challenge is that vector embeddings in a traditional RAG system are stored and searched in plaintext. As shown in D1, this creates an attack surface: an adversary with access to the vector database can partially reconstruct document content from stored embeddings.

With VectaX, the embeddings are never plaintext on the server side. The entire pipeline from storage through retrieval operates on ciphertext. This is encryption in use applied to RAG: computation (similarity search) happens on encrypted data, without decryption, and the correct results come back.

VectaX uses Similarity-Preserving Search, a specialised form of searchable encryption. The encryption function transforms a plaintext vector into a ciphertext vector such that cosine similarity rankings are preserved: if one plaintext vector is more similar to a query than another, the same is true of their ciphertexts. Nearest-neighbour search on encrypted vectors therefore returns the same top-k results as nearest-neighbour search on plaintext vectors.
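The ranking-preservation property itself can be demonstrated with a much simpler transformation. The sketch below applies a secret orthogonal reflection (a Householder matrix) to every vector, which leaves cosine similarity unchanged. This is not VectaX's construction and it is not a secure encryption scheme on its own, since it preserves every pairwise distance; it only illustrates what "same top-k on transformed vectors" means. All names and values are hypothetical.

```python
import math

def make_transform(secret):
    """Build a Householder reflection H = I - 2uu^T from a secret direction."""
    norm = math.sqrt(sum(x * x for x in secret))
    unit = [x / norm for x in secret]

    def transform(vector):
        projection = sum(u * x for u, x in zip(unit, vector))
        return [x - 2.0 * projection * u for x, u in zip(vector, unit)]

    return transform

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def top_k(query, vectors, k):
    """Return the k vector names most similar to the query."""
    ranked = sorted(vectors, key=lambda name: cosine(query, vectors[name]), reverse=True)
    return ranked[:k]

transform = make_transform([0.3, -1.2, 0.7, 2.0])   # hypothetical secret
docs = {
    "a": [1.0, 0.0, 0.2, 0.1],
    "b": [0.0, 1.0, 0.0, 0.5],
    "c": [0.5, 0.5, 0.5, 0.5],
    "d": [-1.0, 0.2, 0.0, 0.3],
}
query = [0.9, 0.1, 0.3, 0.2]

plain_ranking = top_k(query, docs, 3)
hidden_docs = {name: transform(vec) for name, vec in docs.items()}
hidden_ranking = top_k(transform(query), hidden_docs, 3)
print(plain_ranking, hidden_ranking)
```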

1. Document ingestion (Plaintext). Raw documents read from source. Embedding model called to generate vector representations.

2. VectaX encryption (Encrypted). sdk.vectax.encrypt(VectorData(vector=embedding, id="doc_1"))

3. RBAC policy applied (Role-gated). sdk.rbac.generate_user_secret_key({"roles": ["analyst"], "departments": ["finance"]})

4. Encrypted storage in ChromaDB (Protected). Encrypted vectors stored. No plaintext embeddings in the database. Attacker with DB access cannot reconstruct documents.

5. Encrypted similarity search (Encrypted). Query encrypted before search. Similarity ordering preserved over encrypted vectors. Same top-k results as plaintext search.

6. Decryption for authorised output (Plaintext). sdk.vectax.decrypt(encrypted_result) called for authorised users only. LLM receives plaintext context.

7. Metadata protection (FPE). Format-preserving encryption applied to document metadata. Encrypted metadata maintains original format (string, number, date) for compatibility with downstream systems.
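The format-preserving property in the metadata step can be sketched with a toy cipher. The example below is an assumption-laden illustration, not the algorithm VectaX uses and not a standardised mode such as NIST FF1. It builds a small Feistel network over decimal digit strings with HMAC-SHA256 as the round function, so an 8-digit date field encrypts to another 8-digit string. The function names and the demo key are hypothetical.

```python
import hashlib
import hmac

def _round_value(key: bytes, round_no: int, half: int, modulus: int) -> int:
    """Keyed pseudo-random round function built from HMAC-SHA256."""
    msg = f"{round_no}:{modulus}:{half}".encode()
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus

def _split(digits: str):
    """Split an even-length decimal string into two integer halves."""
    if not (digits.isascii() and digits.isdigit()) or len(digits) % 2 != 0:
        raise ValueError("toy FPE expects an even-length string of decimal digits")
    width = len(digits) // 2
    return int(digits[:width]), int(digits[width:]), width

def toy_fpe_encrypt(key: bytes, digits: str, rounds: int = 8) -> str:
    left, right, width = _split(digits)
    modulus = 10 ** width
    for r in range(rounds):
        # Feistel step: (L, R) -> (R, L + F(R)) with addition modulo 10^width.
        left, right = right, (left + _round_value(key, r, right, modulus)) % modulus
    return f"{left:0{width}d}{right:0{width}d}"

def toy_fpe_decrypt(key: bytes, digits: str, rounds: int = 8) -> str:
    left, right, width = _split(digits)
    modulus = 10 ** width
    for r in reversed(range(rounds)):
        # Inverse step: (L', R') -> (R' - F(L'), L').
        left, right = (right - _round_value(key, r, left, modulus)) % modulus, left
    return f"{left:0{width}d}{right:0{width}d}"

key = b"demo-key-not-for-production"  # hypothetical key for illustration only
plain_date = "20240131"
cipher_date = toy_fpe_encrypt(key, plain_date)
recovered = toy_fpe_decrypt(key, cipher_date)
print(cipher_date, recovered)
```

The ciphertext has the same length and character set as the input, which is what lets downstream systems with fixed schemas store it unchanged. Note that the output is a digit string of the right shape, not necessarily a valid calendar date.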

VectaX combines three of the techniques in this module. Similarity-Preserving Search handles the vector layer. Format-preserving encryption (FPE) handles metadata like document IDs, dates, and category fields. RBAC using attribute-based cryptography enforces access control at query time: a user's secret key encodes their roles, groups, and departments, and only queries from users with the right attributes return decryptable results.

This architecture directly addresses the RAG threat model described in D1: sensitive data needs to be protected, but it also needs to be usable. VectaX keeps it usable through similarity preservation and keeps it protected because embeddings are stored and searched only as ciphertext, never as plaintext, in the vector database.

Section 14

Secure multiparty computation (SMPC)

Secure multiparty computation (SMPC) is an alternative to homomorphic encryption for the problem of computing on private data held by multiple parties. Rather than encrypting data so that one party can compute on it, SMPC distributes the computation itself across multiple parties so that no single party sees the full plaintext.

SMPC allows multiple parties to jointly compute a function over their inputs while keeping those inputs private. Each party learns only the result of the computation, not the individual inputs of the other parties.
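One of the simplest building blocks behind this idea is additive secret sharing. The sketch below is a minimal illustration under stated assumptions: three hypothetical parties, honest behaviour, and a public prime modulus chosen only for the demo. Each party splits its input into random shares, every party sums the shares it receives, and only the combined total is reconstructed.

```python
import secrets

PRIME = 2**61 - 1  # public modulus (assumed for the demo); all sums are taken modulo it

def share(secret: int, n_parties: int) -> list:
    """Split a secret into n random shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares

def reconstruct(shares) -> int:
    """Recombine shares; fewer than all n shares look uniformly random."""
    return sum(shares) % PRIME

# Hypothetical private inputs held by three parties.
inputs = {"hospital_a": 120, "hospital_b": 75, "hospital_c": 310}
n = len(inputs)

# Row i holds the shares produced by party i; share j is sent to party j.
distributed = [share(value, n) for value in inputs.values()]

# Party j adds up the shares it received (column j) and publishes only that sum.
partial_sums = [sum(column) % PRIME for column in zip(*distributed)]

total = reconstruct(partial_sums)
print(total)  # 505
```

No party ever sees another party's input, only random-looking shares, yet the published partial sums combine to the correct total.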

In AI, SMPC is most often used in federated learning scenarios. Multiple organisations each hold private training data. They want to train a shared model without sending their raw data to a central server. SMPC allows them to aggregate model gradients or parameters during training such that each party contributes to the model update without revealing their individual training records.
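The aggregation step can be sketched with pairwise masking, a simplified form of the idea behind secure aggregation protocols. This is an illustration under assumptions, not a production protocol: updates are taken to be integer-encoded, each pair of parties is assumed to have already agreed on a shared seed, nobody drops out, and Python's random module stands in for a cryptographic pseudo-random generator. All names are hypothetical.

```python
import random

MODULUS = 2**32  # updates are assumed to be integer-encoded and summed modulo this

def pairwise_mask(seed: int, length: int) -> list:
    """Expand a shared seed into a mask vector (not cryptographically secure)."""
    rng = random.Random(seed)
    return [rng.randrange(MODULUS) for _ in range(length)]

def mask_update(party: int, update: list, pair_seeds: dict) -> list:
    """Add the mask for pairs where this party is first, subtract where it is second."""
    masked = list(update)
    for (i, j), seed in pair_seeds.items():
        if party not in (i, j):
            continue
        mask = pairwise_mask(seed, len(update))
        sign = 1 if party == i else -1
        masked = [(m + sign * k) % MODULUS for m, k in zip(masked, mask)]
    return masked

# Hypothetical local model updates from three organisations.
updates = {0: [3, 10, 7], 1: [5, 2, 8], 2: [1, 1, 4]}
# One seed per pair, assumed to have been agreed out of band.
pair_seeds = {(0, 1): 101, (0, 2): 202, (1, 2): 303}

masked_updates = {p: mask_update(p, u, pair_seeds) for p, u in updates.items()}

# The server only ever sees masked vectors; the masks cancel in the sum.
aggregate = [sum(column) % MODULUS for column in zip(*masked_updates.values())]
print(aggregate)  # [9, 13, 19]
```

Each masked vector on its own looks random to the server, but every mask is added once and subtracted once, so the sum equals the true aggregate update.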

SMPC and HE are often used together. SMPC protocols typically require significant communication between parties: each round of computation involves multiple message exchanges. This communication overhead can be reduced by using HE. Parties can homomorphically encrypt their inputs before participating in the SMPC protocol, and the computation over encrypted inputs reduces the number of rounds required.

SMPC in federated learning

Multiple parties train local models on private data. SMPC aggregates model updates (gradients or weights) at the server without the server seeing any individual party's update. Used in healthcare and finance consortiums.

HE plus SMPC

Parties encrypt inputs using HE before participating in the SMPC protocol. The server performs homomorphic operations on encrypted inputs, reducing the number of communication rounds needed and lowering the trust requirement on the server.

This module focuses on the FHE techniques that are the cryptographic foundation of VectaX. SMPC is covered in depth in D5, and its relationship with federated learning, including threat models, poisoning attacks, and production architectures, is covered in modules D4 and D5.

Section 15

Frequently asked questions

What is the difference between encryption at rest, in transit, and in use?

Encryption at rest protects data stored on disk or in a database using symmetric ciphers like AES-256. The data must be decrypted before any application can use it. Encryption in transit protects data moving between services using TLS 1.3. Data is decrypted at every endpoint before processing. Both leave the same gap: the processing layer. Encryption in use closes this gap. It is the umbrella term for techniques that allow computation directly on ciphertext without decryption: searchable encryption for query operations, partially and somewhat homomorphic encryption for bounded arithmetic, and fully homomorphic encryption for arbitrary computation. VectaX delivers encryption in use specifically for vector embedding storage and retrieval in RAG pipelines.

What is searchable encryption and how does it apply to vector databases?

Searchable encryption allows a database server to process queries over encrypted data without decrypting it. The query is submitted as ciphertext, the database searches using a secure index, and returns encrypted results. The server never sees the plaintext data or query. MongoDB Queryable Encryption is a production implementation that can use AWS KMS, Google Cloud KMS, or Azure Key Vault as key providers. In vector databases, searchable encryption enables similarity search over encrypted embeddings, which is how VectaX allows RAG retrieval without exposing plaintext vectors to the database server.

What is the difference between PHE, SHE, and FHE?

Partially homomorphic encryption (PHE) supports only one operation type: either addition or multiplication but not both. RSA is multiplicatively homomorphic; Paillier is additively homomorphic. Somewhat homomorphic encryption (SHE) supports both addition and multiplication up to a bounded number of operations before noise accumulation makes the ciphertext undecryptable. Fully homomorphic encryption (FHE) supports unlimited operations by using bootstrapping to reset accumulated noise. FHE enables any computable function to be evaluated on encrypted data but is computationally more expensive than PHE or SHE.
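The two PHE examples named above can be checked with toy numbers. The sketch below uses textbook RSA and a minimal Paillier implementation with tiny primes chosen only for illustration. Both are insecure at this size, and textbook RSA without padding is insecure at any size. The point is only to show which single operation each scheme supports on ciphertexts.

```python
import math
import random

# --- Textbook RSA: multiplying ciphertexts multiplies the plaintexts ---
rsa_n, rsa_e, rsa_d = 3233, 17, 2753  # n = 61 * 53, toy parameters

def rsa_enc(m: int) -> int:
    return pow(m, rsa_e, rsa_n)

def rsa_dec(c: int) -> int:
    return pow(c, rsa_d, rsa_n)

rsa_product = rsa_dec(rsa_enc(7) * rsa_enc(6) % rsa_n)
print(rsa_product)  # 42

# --- Paillier: multiplying ciphertexts adds the plaintexts ---
p, q = 293, 433  # toy primes
n = p * q
n_sq = n * n
g = n + 1
lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
mu = pow(lam, -1, n)  # modular inverse (Python 3.8+)

rng = random.Random(7)  # seeded for a repeatable demo; not a secure randomness source

def paillier_enc(m: int) -> int:
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return pow(g, m, n_sq) * pow(r, n, n_sq) % n_sq

def paillier_dec(c: int) -> int:
    # L(x) = (x - 1) // n, then multiply by mu modulo n.
    return (pow(c, lam, n_sq) - 1) // n * mu % n

paillier_sum = paillier_dec(paillier_enc(1200) * paillier_enc(345) % n_sq)
print(paillier_sum)  # 1545
```

Neither scheme can do the other's operation on ciphertexts, which is exactly the limitation that SHE and FHE remove.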

What is bootstrapping in FHE and why is it computationally expensive?

Bootstrapping is the technique Craig Gentry introduced in 2009 that converts SHE into FHE. It reduces accumulated noise in a ciphertext by homomorphically evaluating the decryption circuit of the scheme itself, using an encryption of the secret key. The output is a fresh ciphertext encoding the same plaintext with lower noise, allowing further operations to proceed. Bootstrapping is expensive because the decryption circuit is deep: evaluating it takes many homomorphic operations, each of which consumes noise budget itself, so the parameters must be large enough to finish the circuit with budget to spare. A single bootstrapping operation can take from tens of milliseconds to several minutes depending on the scheme, parameters, and hardware.
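The noise growth that bootstrapping exists to reset can be observed in a toy somewhat homomorphic scheme. The sketch below follows the symmetric integer construction of van Dijk, Gentry, Halevi and Vaikuntanathan in miniature: a bit m is encrypted as p*q + 2r + m for a secret odd p. The parameters are assumptions chosen so the effect is visible, they are far too small to be secure, and bootstrapping itself is not implemented.

```python
import random

rng = random.Random(2024)  # seeded for a repeatable demo
P = (1 << 61) - 1  # toy secret key: an odd integer, far too small for real use

def encrypt(bit: int) -> int:
    q = rng.getrandbits(80)            # large random multiple of the key
    r = rng.randrange(1 << 7, 1 << 8)  # small random noise
    return P * q + 2 * r + bit

def noise(c: int) -> int:
    """Centred remainder of c modulo P, i.e. the noise term 2r + m."""
    n = c % P
    return n - P if n > P // 2 else n

def decrypt(c: int) -> int:
    return noise(c) % 2

# Addition: noise terms add, so the budget shrinks slowly.
added = encrypt(1) + encrypt(0)
add_result = decrypt(added)

# Multiplication: noise terms multiply, so the budget shrinks fast.
product = encrypt(1)
true_noise = noise(product)  # tracked exactly, without reduction modulo P
history = []
for step in range(1, 9):
    fresh = encrypt(1)
    true_noise *= noise(fresh)
    product *= fresh
    within_budget = true_noise < P // 2
    history.append((step, true_noise.bit_length(), within_budget, decrypt(product)))

for row in history:
    print(row)  # (multiplications so far, noise bits, still decryptable?, decrypted bit)
```

While the noise stays below half of the key, decryption returns the correct product bit. Once the noise wraps around the key, the decrypted bit is no longer reliable, which is the point at which an SHE scheme must stop and an FHE scheme would bootstrap.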

Why is CKKS better than other FHE schemes for machine learning?

CKKS (Cheon, Kim, Kim and Song, 2017) is designed for approximate arithmetic on real and complex numbers. Machine learning computations are inherently approximate: floating-point numbers are already approximations, and small rounding errors in model inference are acceptable. CKKS exploits this by allowing controlled approximation error, which makes it significantly more efficient than exact-arithmetic FHE schemes for this kind of workload. CKKS also supports SIMD batching: many values are encoded into one ciphertext and each operation is applied in parallel across all slots, which substantially improves throughput for the matrix and vector operations common in neural networks.

How does VectaX implement encrypted similarity search without decrypting vectors?

VectaX uses Similarity-Preserving Search, a specialised searchable encryption scheme. The encryption function transforms a plaintext vector into a ciphertext vector such that ranking by cosine similarity is preserved: if one plaintext vector is more similar to a query than another, the same holds for their ciphertexts. Nearest-neighbour search on encrypted vectors therefore returns the same top-k results as search on plaintext vectors. Access control is enforced through RBAC using attribute-based cryptography: a user's secret key encodes their roles, groups, and departments, and only queries from authorised users return decryptable results. The database server never processes plaintext embeddings at any point.
