Encrypted AI Inference on PHI with VectaX

The core problem

Most healthcare AI has a decryption window

In 2023, the HHS Office for Civil Rights received 725 breach reports affecting 500 or more individuals. The majority involved systems that stored data encrypted but processed it in plaintext. Ransomware groups targeting healthcare learned this fast: if you can get into the inference environment, the patient records are right there in memory.

The standard approach to healthcare AI security works like this: encrypt PHI at rest in the database, decrypt it when a query comes in, run the embedding model, decrypt again for similarity search, generate the response, then re-encrypt for storage. The data is only exposed during processing. The problem is that processing is when everything interesting happens. Any attacker who compromises the AI service, the embedding model server, or the vector database connection sees plaintext patient records.

Fully Homomorphic Encryption (FHE) changes this constraint. FHE lets you perform computation directly on encrypted data. The result, when decrypted, matches what you would have got from running the same computation on plaintext: exactly for integer schemes such as BFV and BGV, and to within a small, bounded error for approximate schemes such as CKKS. The processing environment never sees the actual patient data.
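The idea of computing on ciphertext is easier to trust once you have seen it run. The sketch below is a toy Paillier cryptosystem in pure Python. Paillier is only additively homomorphic, so it is not FHE and it is not what VectaX uses; it is here because it fits in a few lines and shows the property in question. The primes are deliberately tiny and give no security, and the function names are illustrative.

```python
import math
import secrets

# Toy Paillier key material. These primes are far too small for real use;
# they only keep the arithmetic readable.
P, Q = 1009, 1013
N = P * Q
N_SQ = N * N
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(p - 1, q - 1)
MU = pow(LAMBDA, -1, N)  # valid because the generator is g = n + 1


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < N using only the public value N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if math.gcd(r, N) == 1:
            break
    # (1 + N)^m mod N^2 equals 1 + m*N, so g^m needs no exponentiation.
    return ((1 + m * N) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c: int) -> int:
    """Recover the plaintext using the private values LAMBDA and MU."""
    x = pow(c, LAMBDA, N_SQ)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    """Add two plaintexts by multiplying their ciphertexts."""
    return (c1 * c2) % N_SQ


a, b = encrypt(17), encrypt(25)
print(decrypt(add_encrypted(a, b)))  # 42
```

The party that calls add_encrypted never needs LAMBDA or MU. That is the same separation the rest of this module relies on: the environment doing the computation holds ciphertext only.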

What the risk looks like in practice

A clinical AI assistant running on a shared GPU cluster decrypts PHI embeddings to perform similarity search. Any process on that GPU with sufficient privilege can read memory. Container escape vulnerabilities, hypervisor attacks, and rogue insiders all have the same opportunity: catch the data in the decryption window. FHE eliminates the window.

This module builds a clinical AI pipeline using VectaX that processes patient data without ever creating that decryption window. You will encrypt the embeddings, run similarity search on ciphertext, and generate responses without the underlying records ever existing in plaintext inside your processing environment.

Risk model

PHI risk taxonomy for AI systems

Before building, you need a clear picture of what PHI exists in a clinical AI pipeline and where it is vulnerable. The 18 HIPAA identifiers are the legal definition, but the security risk is not uniform across them. Some identifiers are far more dangerous in an AI context than others.

Free-text clinical notes

The highest-risk category for RAG systems. Notes contain multiple identifiers in natural language. They are the primary input for clinical AI and the most likely to appear in LLM responses if the pipeline leaks context. Risk: high.

Structured records and IDs

Patient IDs, MRNs, insurance numbers, and dates. These appear in vector database metadata. Format-preserving encryption protects them while keeping the data usable for lookups. Risk: medium-high.

Embedding vectors

Embeddings of clinical text are not PHI by themselves, but they can be inverted. Several published attacks recover meaningful information from medical embeddings. Storing them in plaintext is a compliance risk under HIPAA's minimum necessary rule. Risk: medium.

Access logs and query patterns

Who queried which patient record is itself sensitive. Audit logs need integrity protection, and query patterns can leak information even when the records are encrypted. Risk: medium, often overlooked.

VectaX addresses the first three directly. Clinical note embeddings are encrypted before storage using similarity-preserving vector encryption. Structured identifiers use format-preserving encryption. Access logs are covered in Module F2. This module covers the inference and retrieval side.

How VectaX works

Three capabilities that make this possible

VectaX SDK gives you three distinct capabilities that work together in a healthcare pipeline.

Similarity-preserving vector encryption

Clinical note embeddings are encrypted with an algorithm that preserves vector geometry. Cosine similarity between two encrypted vectors produces the same ranking as cosine similarity between their plaintext equivalents. This means your similarity search on encrypted clinical embeddings returns the same results as the same search on plaintext. There is no retrieval accuracy loss.
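To see why a transformation can hide coordinates and still preserve ranking, consider any orthogonal map: it leaves dot products and norms unchanged, so cosine similarity is unchanged too. The sketch below uses the simplest such map, a secret permutation of coordinates with secret sign flips. This is an illustration of the geometric property only. It is not VectaX's algorithm, and a signed permutation on its own is not a secure cipher. All names are hypothetical.

```python
import math
import random


def make_key(dim, seed):
    """Secret key: a coordinate permutation plus one sign flip per coordinate."""
    rng = random.Random(seed)  # seeded for a repeatable demo; not a CSPRNG
    perm = list(range(dim))
    rng.shuffle(perm)
    signs = [rng.choice((-1, 1)) for _ in range(dim)]
    return perm, signs


def transform(vec, key):
    """Apply the signed permutation, which is an orthogonal map."""
    perm, signs = key
    return [sign * vec[src] for src, sign in zip(perm, signs)]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return document names ordered from most to least similar."""
    return sorted(docs, key=lambda name: cosine(query, docs[name]), reverse=True)


docs = {
    "cardiac": [0.9, 0.1, 0.0, 0.2],
    "renal": [0.1, 0.8, 0.3, 0.0],
    "ortho": [0.0, 0.2, 0.9, 0.1],
}
query = [0.8, 0.2, 0.1, 0.1]

key = make_key(4, seed=7)
enc_docs = {name: transform(vec, key) for name, vec in docs.items()}
enc_query = transform(query, key)

print(rank(query, docs))
print(rank(enc_query, enc_docs))  # same order as the line above
```

The search side only ever handles enc_docs and enc_query, and the key stays with whoever produced them.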

Format-preserving encryption (FPE)

Patient IDs, MRNs, and insurance numbers have format requirements. A patient ID that looks like "P-20240158" still needs to look like a patient ID after encryption, or downstream systems break. VectaX FPE encrypts these fields while maintaining their original format. An encrypted MRN still looks like an MRN.
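The format-preservation property can be shown with a toy transform: shift each digit by a keyed amount and copy every other character through unchanged. This is a sketch of the property only. Production FPE uses vetted modes such as FF1 from NIST SP 800-38G, and the per-position shift below is not one of them and should not protect real identifiers. The function names, key, and tweak are all hypothetical.

```python
import hashlib
import hmac

DIGITS = "0123456789"


def _digit_shift(key: bytes, tweak: bytes, position: int) -> int:
    """Derive a keyed shift in 0..9 for one character position."""
    mac = hmac.new(key, tweak + position.to_bytes(4, "big"), hashlib.sha256)
    return int.from_bytes(mac.digest(), "big") % 10


def toy_fpe(value: str, key: bytes, tweak: bytes, decrypt: bool = False) -> str:
    """Shift each ASCII digit by a keyed amount; copy other characters as-is."""
    out = []
    for i, ch in enumerate(value):
        if ch in DIGITS:
            shift = _digit_shift(key, tweak, i)
            if decrypt:
                shift = -shift
            out.append(str((int(ch) + shift) % 10))
        else:
            out.append(ch)  # prefixes and hyphens keep the format intact
    return "".join(out)


key = b"demo-key-not-for-production"
tweak = b"mrn"
token = toy_fpe("P-20240158", key, tweak)
print(token)
print(toy_fpe(token, key, tweak, decrypt=True))  # P-20240158
```

The tweak plays the same role as in real FPE: the same key with a different tweak (for example b"insurance_id") gives an unrelated mapping, so equal digits in different fields do not encrypt to equal digits.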

Role-based access control with encrypted keys

VectaX RBAC generates per-user secret keys scoped to roles, groups, and departments. A physician can decrypt records in their department. A billing administrator cannot access clinical notes. The access policy is cryptographically enforced, not just checked at the application layer.

The key difference

Most RBAC implementations check permissions at query time and then return plaintext data. If the permission check is bypassed, the data is exposed. VectaX RBAC is cryptographic: a user without the right key cannot decrypt the data, regardless of what the application layer does.
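The difference between a permission check and a cryptographic control can be sketched with the standard library alone. Below, each (role, group, department) scope gets its own key derived from a master secret, and a record sealed under one scope fails authentication under any other. This is a toy under stated assumptions: the text above does not describe how VectaX implements RBAC, the keystream construction is home-made for the sake of a dependency-free example, and a real system would use a vetted AEAD such as AES-GCM. Every name here is hypothetical.

```python
import hashlib
import hmac
import secrets


def scope_key(master: bytes, role: str, group: str, department: str) -> bytes:
    """Derive a key bound to one (role, group, department) scope."""
    label = f"{role}|{group}|{department}".encode()
    return hmac.new(master, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 over key, nonce, and a block counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then MAC. Layout: nonce (16) | ciphertext | tag (32)."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    ct = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    return nonce + ct + tag


def open_sealed(key: bytes, blob: bytes) -> bytes:
    """Verify the tag first; refuse to decrypt if the key is out of scope."""
    nonce, ct, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise PermissionError("key does not match this record's scope")
    return bytes(a ^ b for a, b in zip(ct, _keystream(key, nonce, len(ct))))


master = secrets.token_bytes(32)
physician = scope_key(master, "physician", "cardiology", "medicine")
billing = scope_key(master, "billing_admin", "finance", "administration")

note = seal(physician, b"ECG shows sinus tachycardia")
print(open_sealed(physician, note))
```

Calling open_sealed(billing, note) raises PermissionError even if every application-layer check has been bypassed, because the billing key simply is not the key the record was sealed under.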

Getting started

Installation and setup

VectaX is part of the Mirror SDK. You need a Mirror Security Platform account and an API key. The SDK also requires the encryption library.

bash: install

# Install core SDK
pip install mirror-sdk

# Install with examples dependencies (OpenAI, ChromaDB)
# Quoted so that shells such as zsh do not treat the brackets as a glob
pip install "mirror-sdk[examples]"

# Install encryption library (required for VectaX FHE features)
pip install mirror_enc

bash: .env

# Required
MIRROR_API_KEY=your-api-key
MIRROR_SERVER_URL=https://mirrorapi.azure-api.net/v1

# Optional: telemetry and policy eval
MIRROR_TELEMETRY_ENABLED=true
MIRROR_POLICY_EVAL_ENABLED=true

python: init.py

from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig

# Load from environment variables (recommended for production)
config = MirrorConfig.from_env()
sdk = MirrorSDK(config)

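# Reaching this line means the config loaded and the SDK object was
# constructed without raising; it does not exercise the API itself
print("Mirror SDK ready")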
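MirrorConfig.from_env() can only work if the variables from .env are present in the process environment. A minimal pre-flight sketch, using only the standard library, fails fast with a readable message when one is missing. The helper names missing_mirror_env and require_mirror_env are hypothetical and not part of the Mirror SDK; only the two variable names come from the .env listing above.

```python
import os

REQUIRED_VARS = ("MIRROR_API_KEY", "MIRROR_SERVER_URL")


def missing_mirror_env(env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def require_mirror_env(env=None):
    """Raise a clear error before the SDK is constructed if config is incomplete."""
    missing = missing_mirror_env(env)
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
```

Call require_mirror_env() once at start-up, before MirrorConfig.from_env(), so a misconfigured deployment stops with a message that names the missing variable.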

Step 1

Encrypting clinical embeddings

The first step is embedding your clinical notes and encrypting the resulting vectors before they go anywhere. This code generates an embedding using OpenAI, then encrypts it with VectaX before storing it.

python: embed_and_encrypt.py

import openai
from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig
from mirror_sdk.core.models import VectorData

config = MirrorConfig.from_env()
sdk = MirrorSDK(config)

# A representative clinical note (anonymised for this example)
clinical_note = """
Patient: 58-year-old male presenting with chest pain and shortness of breath.
History of hypertension. ECG shows sinus tachycardia. Troponin elevated at 0.8.
Assessment: Rule out NSTEMI. Admit for monitoring and cardiology consult.
"""

# Step 1: Generate the embedding
response = openai.embeddings.create(
    model="text-embedding-3-small",
    input=clinical_note
)
embedding = response.data[0].embedding

# Step 2: Wrap in VectorData and encrypt
# patient_001 is a non-identifying reference ID (the real MRN is FPE-encrypted separately)
vector = VectorData(vector=embedding, id="patient_001_admission_note_01")
encrypted_vector = sdk.vectax.encrypt(vector)

print(f"Original vector length: {len(embedding)}")
print("Encrypted vector ready to store. Persist only the ciphertext.")

What is encrypted here

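The VectorData object wraps the numeric embedding, and encrypted_vector holds its ciphertext. Only encrypted_vector should be written to the vector database. The plaintext embedding is still in this process's memory (in the embedding and vector variables) until those references are released, so run this step inside a trusted boundary and do not log or persist them. The same care applies to the note text: as written, it is sent to the OpenAI embeddings endpoint in plaintext, so that call must either be covered by a BAA or be replaced with an embedding model you host yourself. Store the note itself separately in an encrypted document store and never pass it to the vector database in plaintext.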

Step 2

RBAC for healthcare roles

Different staff need different access. A cardiologist should retrieve cardiac notes from their own department. A pharmacy technician should not. VectaX RBAC enforces this cryptographically.

python: rbac_setup.py

from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig

config = MirrorConfig.from_env()
sdk = MirrorSDK(config)

# Define access policy for the clinical AI system
# These map to your organisation's actual role structure
clinical_access_policy = {
    "roles": ["physician", "nurse_practitioner"],
    "groups": ["cardiology"],
    "departments": ["medicine"]
}

sdk.set_policy(clinical_access_policy)

# Generate a key for a cardiology physician
# This key can decrypt records accessible to physicians in cardiology
physician_key = sdk.rbac.generate_user_secret_key({
    "roles": ["physician"],
    "groups": ["cardiology"],
    "departments": ["medicine"]
})

# Generate a key for a billing administrator
# This key cannot access clinical notes, only billing-scoped records
billing_key = sdk.rbac.generate_user_secret_key({
    "roles": ["billing_admin"],
    "groups": ["finance"],
    "departments": ["administration"]
})

print("Physician key generated for cardiology scope")
print("Billing key generated for administration scope")
print("These keys cannot decrypt each other's records")

Store these keys in your key management system, not in application code. Module F2 covers the key lifecycle and rotation requirements for HIPAA compliance.

Role	Accessible records	Cannot access	HIPAA basis
physician	Clinical notes, diagnostics, orders in their department	Records from other departments without treatment relationship	Treatment exception, minimum necessary
nurse_practitioner	Clinical notes in their unit	Records outside their unit, billing data	Treatment exception, minimum necessary
billing_admin	Billing codes, insurance identifiers	Clinical notes, diagnostic data	Operations exception, minimum necessary
researcher	De-identified data only	All PHI without specific authorization	Research exception requires IRB or waiver

Step 3

Format-preserving encryption for patient IDs

Patient MRNs, insurance numbers, and other structured identifiers need to stay in their original format. Your downstream systems may validate format. FPE handles this: the encrypted value looks like the original format.

python: fpe_identifiers.py

from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig

config = MirrorConfig.from_env()
sdk = MirrorSDK(config)

# Patient metadata with structured identifiers
patient_metadata = {
    "source": "cardiology_ehr",
    "admission_date": "2026-04-10",
    "mrn": "P-20240158",
    "insurance_id": "BCB-7821-0044"
}

# Generate FPE key and tweak from the metadata structure
fpe_key = sdk.metadata.generate_key()
fpe_tweak = sdk.metadata.generate_tweak_from_data(patient_metadata)

# Encrypt the metadata
# The mrn and insurance_id fields are encrypted but preserve their format
encrypted_metadata = sdk.metadata.encrypt(patient_metadata, fpe_key, fpe_tweak)

print(f"Original MRN:   {patient_metadata['mrn']}")
print(f"Encrypted MRN:  {encrypted_metadata['mrn']}")
print("Format preserved: still looks like a patient ID")

Key management note

The FPE key must be stored securely and separately from the data. If the key is lost, the metadata is permanently unreadable. Module F2 covers key escrow, rotation, and backup requirements for HIPAA. Do not store keys in environment variables in production.

Step 4

Encrypted similarity search on clinical notes

This is the capability that makes an encrypted clinical RAG system practical. You can run similarity search on the encrypted embeddings without decrypting them. The query vector is encrypted, sent to the database, compared against encrypted stored vectors, and the ranked results come back still encrypted.

python: encrypted_search.py

import openai
from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig
from mirror_sdk.core.models import VectorData

config = MirrorConfig.from_env()
sdk = MirrorSDK(config)

# The clinician's query
query = "chest pain with elevated troponin, rule out NSTEMI"

# Step 1: Embed the query
query_embedding = openai.embeddings.create(
    model="text-embedding-3-small",
    input=query
).data[0].embedding

# Step 2: Encrypt the query vector before sending to the database
query_vector = VectorData(vector=query_embedding, id="query")
encrypted_query = sdk.vectax.encrypt(query_vector)

# Step 3: Search encrypted vectors against encrypted store
# Your ChromaDB or vector DB holds only encrypted vectors
# The similarity computation happens on ciphertext
# (shown here as pseudocode for the search call pattern)
# encrypted_results = vector_db.query(encrypted_query, n_results=5)

# Step 4: Decrypt results only at the point of use
# and only with a key scoped to the requesting user's role
# physician_key grants decryption for physician-scoped records only
# decrypted_vector = sdk.vectax.decrypt(encrypted_result)

print("Query encrypted before leaving the user's context")
print("Search operates on ciphertext end to end")
print("Decryption only happens with role-scoped key at point of use")
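The vector_db.query call in the listing above is left as a comment. To make the call pattern concrete, here is a small in-memory stand-in that ranks stored vectors against a query by cosine similarity without ever holding a key. It is a sketch under one assumption: that encrypted vectors can be handled as plain lists of numbers, which is what similarity-preserving encryption implies but which this module does not spell out. The class name and the (id, vector) return shape are hypothetical, and query_encrypted is not a ChromaDB method.

```python
import math


class InMemoryEncryptedStore:
    """Holds opaque vectors and ranks them; it never sees a key or plaintext."""

    def __init__(self):
        self._vectors = {}

    def add(self, record_id, encrypted_vector):
        self._vectors[record_id] = list(encrypted_vector)

    @staticmethod
    def _cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0

    def query_encrypted(self, encrypted_query, n_results=5):
        """Return up to n_results (record_id, vector) pairs, most similar first."""
        ranked = sorted(
            self._vectors.items(),
            key=lambda item: self._cosine(encrypted_query, item[1]),
            reverse=True,
        )
        return ranked[:n_results]


store = InMemoryEncryptedStore()
store.add("note_a", [1.0, 0.0])
store.add("note_b", [0.0, 1.0])
store.add("note_c", [0.7, 0.7])

for record_id, _vector in store.query_encrypted([0.9, 0.1], n_results=2):
    print(record_id)
```

In a real deployment the vectors passed to add and query_encrypted would be the outputs of sdk.vectax.encrypt, and the store would be your vector database rather than a dictionary.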

Step 5

Full inference pipeline

This combines everything into a working clinical AI assistant. The patient data is encrypted at the source, retrieved via encrypted similarity search, and the context passed to the LLM is constructed without ever persisting plaintext in the AI service layer.

python: clinical_ai_pipeline.py

import openai
from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig
from mirror_sdk.core.models import VectorData

config = MirrorConfig.from_env()
sdk = MirrorSDK(config)

def clinical_rag_query(
    clinician_query: str,
    clinician_key: dict,
    vector_store,   # your encrypted ChromaDB or similar
    n_results: int = 3
) -> str:
    """
    Run a clinical RAG query over encrypted patient records.
    The patient data stays encrypted throughout retrieval.
    Decryption happens only at context assembly, inside this function.
    """

    # 1. Embed the clinician's query
    query_emb = openai.embeddings.create(
        model="text-embedding-3-small",
        input=clinician_query
    ).data[0].embedding

    # 2. Encrypt the query vector before search
    encrypted_q = sdk.vectax.encrypt(
        VectorData(vector=query_emb, id="query")
    )

    # 3. Search the encrypted vector store
    # This comparison runs on ciphertext
    encrypted_results = vector_store.query_encrypted(
        encrypted_q, n_results=n_results
    )

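    # 4. Decrypt each result using the clinician's role-scoped key
    # A key without the right role will fail here cryptographically
    # NOTE: this listing never passes clinician_key to decrypt(); supply it
    # in whatever way the VectaX documentation specifies before relying on this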
    contexts = []
    for enc_result in encrypted_results:
        decrypted = sdk.vectax.decrypt(enc_result)
        contexts.append(decrypted.metadata.get("summary", ""))

    # 5. Build prompt with decrypted context
    context_block = "\n\n".join(contexts)
    prompt = f"""You are a clinical decision support tool.
Use the following patient record summaries to answer the clinician query.
Do not introduce information not present in the records.

Records:
{context_block}

Query: {clinician_query}

Response:"""

    # 6. Call the LLM
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.1
    )

    return response.choices[0].message.content

Where decryption happens

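Decryption occurs inside clinical_rag_query at step 4. The decrypted text exists in memory only while the context is assembled and the LLM call is made. This function does not log, store, or persist it, and the vector store operates on encrypted vectors only. Two calls do carry plaintext out of the process: the query text goes to the embedding endpoint in step 1, and the decrypted record summaries go to the LLM endpoint in step 6 as part of the prompt. Both travel over TLS, but the provider sees plaintext, so each call needs a BAA or a self-hosted model before real PHI passes through it.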
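Keeping decrypted text out of logs depends on the rest of the service, not just on this function. One defensive measure is a logging filter that redacts identifier-shaped tokens before any handler writes them. The sketch below uses the standard logging module and assumes MRNs follow the "P-" plus eight digits shape from the Step 3 example. It catches that one pattern only and does nothing for free-text PHI. The class and logger names are hypothetical.

```python
import io
import logging
import re

# Assumed MRN shape, taken from the "P-20240158" example in Step 3.
MRN_PATTERN = re.compile(r"\bP-\d{8}\b")


class RedactMRN(logging.Filter):
    """Replace MRN-shaped tokens in log messages before they are emitted."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # applies %-style args first
        record.msg = MRN_PATTERN.sub("[MRN REDACTED]", message)
        record.args = ()  # the message is already fully formatted
        return True  # keep the record, now redacted


def build_logger(stream) -> logging.Logger:
    """Create a logger whose only handler redacts MRNs."""
    logger = logging.getLogger("clinical_rag")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid unredacted copies via the root logger
    handler = logging.StreamHandler(stream)
    handler.addFilter(RedactMRN())
    logger.handlers = [handler]
    return logger


buffer = io.StringIO()
log = build_logger(buffer)
log.info("retrieved context for %s", "P-20240158")
print(buffer.getvalue().strip())  # retrieved context for [MRN REDACTED]
```

Attaching the filter to the handler rather than to the logger means it also applies to records that reach this handler from child loggers.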

Before going live

Production checklist

Healthcare AI systems face scrutiny at audit time. This checklist reflects what OCR investigators actually ask for after an incident.

Area	Requirement	VectaX covers
PHI at rest	All PHI encrypted at rest with documented algorithm	Yes: vector encryption + FPE
PHI in transit	TLS 1.2+ for all PHI data in motion	Partial: configure TLS separately
PHI in processing	Minimum necessary access; PHI not exposed in logs	Yes: RBAC keys + no plaintext in logs
Access control	Role-based, cryptographically enforced, auditable	Yes: RBAC with scoped keys
Key management	Keys stored separately from data; rotation documented	Partial: use KMS; covered in F2
Audit logging	Logs of all PHI access with user, timestamp, record	Partial: covered in F2
Breach notification	Can demonstrate what records were accessed if breach occurs	Yes: RBAC scope limits blast radius
Business associate	BAA in place with any cloud provider processing PHI	Manual: sign BAA with your cloud provider
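The "PHI in transit" row is marked Partial because TLS is configured outside VectaX. For Python clients you write yourself, a minimal sketch of enforcing the TLS 1.2 floor with the standard ssl module looks like this. The function name is hypothetical, and HTTP libraries that manage their own TLS settings need the equivalent option set through their own configuration.

```python
import ssl


def phi_tls_context() -> ssl.SSLContext:
    """Client-side context that verifies certificates and refuses TLS below 1.2."""
    ctx = ssl.create_default_context()  # certificate and hostname checks on
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = phi_tls_context()
print(ctx.minimum_version.name)  # TLSv1_2
```

Pass the context to standard-library clients, for example http.client.HTTPSConnection(host, context=ctx) or urllib.request.urlopen(url, context=ctx).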

What this module does not cover

This module covers the encryption and retrieval pipeline. It does not cover audit trail architecture, key lifecycle management, or the HIPAA documentation requirements. Those are in Module F2. Do not go to production without completing both modules.

Common questions

FAQ

What is encrypted AI inference on PHI?

Encrypted AI inference on PHI means running an AI model on patient data that remains encrypted throughout the entire computation. The model never sees the plaintext. This is possible using Fully Homomorphic Encryption, which lets you compute on ciphertext and produce a result that, when decrypted, matches what you would have got from running the same computation on plaintext.

Why can't healthcare AI systems just encrypt data at rest?

Encryption at rest protects stored data from breach, but the data must be decrypted before processing. This means PHI is exposed in memory during every inference call, embedding generation, and similarity search. Any compromise of the processing environment exposes patient records. FHE eliminates this window entirely.

Does VectaX support HIPAA-required access controls?

Yes. VectaX RBAC generates per-user secret keys scoped to roles, groups, and departments. A key generated for role=physician can decrypt records accessible to physicians, while a key for role=billing_admin cannot access clinical notes. This maps directly to HIPAA minimum necessary access requirements.

Does encrypting PHI embeddings affect retrieval quality?

No. VectaX vector encryption is similarity-preserving. The nearest-neighbor ranking of encrypted vectors is identical to that of plaintext vectors. A search for clinically similar notes on encrypted embeddings returns the same results as the same search on plaintext embeddings.

What PHI can VectaX protect in a clinical AI pipeline?

VectaX can protect any data that can be represented as a vector embedding or structured metadata. This includes clinical note embeddings, diagnostic embeddings, patient ID fields using format-preserving encryption, insurance numbers, and any structured record fields. The vector encryption preserves similarity search capability, so a clinical RAG system can still find relevant records without decrypting them.

How does VectaX handle the trade-off between encryption and AI accuracy?

VectaX uses similarity-preserving vector encryption, which means the encrypted vectors maintain their geometric relationships. Cosine similarity between two encrypted vectors produces the same ranking as cosine similarity between their plaintext equivalents. There is no accuracy loss for retrieval tasks. For FHE inference, there is a latency cost, but no accuracy degradation.

Next: HIPAA audit trail and key management

Module F2 covers the compliance side: immutable audit logs, key rotation schedules, what OCR investigators actually look for, and how to generate evidence from your VectaX deployment.
