The starting point
Why financial AI has a harder problem than most sectors
Most sectors worry about a single sensitive data type. Healthcare worries about PHI. Financial services worries about everything at once. A trading desk AI assistant handles material non-public information, customer account balances, transaction histories, fraud signatures, analyst recommendations, and regulatory filings. Some of these must be kept separate from each other by law, not just by policy.
The insider trading risk alone changes the threat model. If a trader on the equities desk can query a RAG system and receive information about a pending deal that the M&A team is still working on, the information barrier has failed and the firm is exposed to insider trading liability, whether or not anyone intended the leak. The AI system becomes the information leak. Standard query-time access control is not enough when a misconfigured retrieval step or a prompt injection attack can bypass it.
There is also a fraud side. Transaction fraud detection often works by comparing new transactions against historical patterns. If those patterns are stored in plaintext, a breach of the fraud detection service exposes every flagged customer's transaction history. VectaX lets the comparison happen on ciphertext: fraud pattern matching that never decrypts the underlying records.
A financial RAG system that decrypts customer data at query time creates two risks: a breach of the retrieval service exposes the full data set, and a prompt injection attack that bypasses access control gives the attacker plaintext records. VectaX encrypted retrieval means a breach yields only ciphertext, and a bypassed access control check still cannot produce readable data without the correct key.
Risk model
Financial data risk taxonomy for AI systems
Financial AI pipelines contain several categories of sensitive data with very different risk profiles. Understanding which category each data type falls into determines how it needs to be protected.
Material non-public information (MNPI): Pending M&A deals, undisclosed earnings data, regulatory actions. Exposure to the wrong internal party is an insider trading violation. Separation is required by law. VectaX RBAC enforces this cryptographically. Risk: critical.
Trading positions and strategy: Client portfolios, position sizes, entry prices, and strategy details. A competitor with this data can front-run. A regulator with it will ask how it leaked. The RAG system that embeds trade notes is the most likely leak vector. Risk: high.
Customer account data: Account numbers, balances, KYC records, and correspondence. GLBA and GDPR require confidentiality. Format-preserving encryption protects account numbers while keeping them usable for downstream lookups. Risk: high.
Fraud signatures: Known fraud patterns and flagged transaction sequences. If these are stored in plaintext, an adversary who accesses the fraud detection system learns exactly which patterns to avoid. Encrypted similarity search protects the signatures. Risk: medium-high.
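One way to make the taxonomy actionable is to encode it as a lookup that ingestion code must consult before storing anything. The sketch below is illustrative only: the category keys, class names, and `required_control` helper are hypothetical and are not part of the Mirror SDK. The control strings simply restate the protections named in this module.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    CRITICAL = 4
    HIGH = 3
    MEDIUM_HIGH = 2


@dataclass(frozen=True)
class DataCategory:
    name: str
    risk: Risk
    control: str  # the protection this module applies to the category


# Category keys are hypothetical labels chosen for this sketch.
TAXONOMY = {
    "mnpi": DataCategory("Material non-public information", Risk.CRITICAL,
                         "RBAC-scoped encryption (information barrier)"),
    "trading": DataCategory("Positions and strategy", Risk.HIGH,
                            "Encrypted embeddings scoped per desk"),
    "customer": DataCategory("Customer account data", Risk.HIGH,
                             "Format-preserving encryption"),
    "fraud": DataCategory("Fraud signatures", Risk.MEDIUM_HIGH,
                          "Encrypted similarity search"),
}


def required_control(category_key: str) -> str:
    """Fail closed: an unclassified data type is rejected, not defaulted."""
    try:
        return TAXONOMY[category_key].control
    except KeyError:
        raise ValueError(f"unclassified data category: {category_key!r}") from None
```

Raising on an unknown category is deliberate: data that nobody has classified should not reach the pipeline with a default level of protection.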
Information barriers
Chinese walls with VectaX RBAC
A Chinese wall in financial services is an information barrier between departments that prevents conflicts of interest. The classic example is the wall between investment banking (which has MNPI about deals) and sales and trading (which would benefit from it). In a financial AI system, this wall needs to be enforced at the retrieval layer, not just the application layer.
VectaX RBAC generates keys scoped to roles, groups, and departments. A key for group=equities cannot decrypt records tagged for group=investment_banking. The separation is cryptographic. Even if a user constructs a query that should be blocked, the retrieved records cannot be decrypted without the matching key.
An application-layer Chinese wall checks permissions before returning data. A prompt injection attack, a misconfigured query, or a compromised service account can bypass that check and retrieve plaintext data from the wrong department. VectaX RBAC means the check is not the only protection. Even if the application layer is compromised, the data cannot be read without the right key.
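The difference between a permission check and a cryptographic barrier can be shown with a toy example. The sketch below is not how VectaX works internally and is not production cryptography: it derives one key per group from a hypothetical root secret and seals records with a hand-rolled stream cipher plus an HMAC tag, using only the Python standard library. All function names are invented for the illustration.

```python
import hashlib
import hmac
import secrets

# Hypothetical root secret for the sketch; a real one would live in a KMS or HSM.
MASTER_SECRET = secrets.token_bytes(32)


def derive_group_key(group: str) -> bytes:
    """Derive a per-group key from the root secret (HMAC-SHA256 as a toy KDF)."""
    return hmac.new(MASTER_SECRET, b"group:" + group.encode(), hashlib.sha256).digest()


def _subkey(key: bytes, label: bytes) -> bytes:
    # Separate subkeys for encryption and authentication
    return hmac.new(key, label, hashlib.sha256).digest()


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    return hashlib.shake_256(key + nonce).digest(length)


def seal(group_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt and tag a record under one group's key."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(_subkey(group_key, b"enc"), nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(_subkey(group_key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + tag + ciphertext


def unseal(group_key: bytes, blob: bytes) -> bytes:
    """Decrypt a record; fails unless the key matches the sealing group."""
    nonce, tag, ciphertext = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(_subkey(group_key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise PermissionError("key is not scoped to this record")
    stream = _keystream(_subkey(group_key, b"enc"), nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


ib_key = derive_group_key("investment_banking")
equities_key = derive_group_key("equities")
deal_note = seal(ib_key, b"pending acquisition, not yet announced")
```

Here `unseal(ib_key, deal_note)` returns the note, while `unseal(equities_key, deal_note)` raises `PermissionError`. No application-layer check is involved: a caller who obtains the blob through a bypassed filter still holds only ciphertext.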
Getting started
Setup and initialisation
pip install mirror-sdk mirror_enc
pip install "mirror-sdk[examples]" # for OpenAI and ChromaDB examples (quoted so shells like zsh do not expand the brackets)
MIRROR_API_KEY=your-api-key
MIRROR_SERVER_URL=https://mirrorapi.azure-api.net/v1
MIRROR_TELEMETRY_ENABLED=true
MIRROR_POLICY_EVAL_ENABLED=true
from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig
config = MirrorConfig.from_env()
sdk = MirrorSDK(config)
print("Mirror SDK ready for financial services pipeline")
Step 1
Encrypting trade history embeddings
Trade notes and analyst commentary are the most valuable data in a trading desk RAG system and the most dangerous if exposed. This code embeds a trade note and encrypts the vector before it reaches the database.
import openai
from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig
from mirror_sdk.core.models import VectorData
config = MirrorConfig.from_env()
sdk = MirrorSDK(config)
# A representative analyst trade note (reference ID only, no customer name in the vector ID)
trade_note = """
Trade: Long 50,000 shares NVDA @ 142.30. Entry thesis: data center buildout
accelerating, supply constraints easing Q3. Stop: 138.00. Target: 162.00.
Risk/reward 3.9:1. Position size 2.1% of book. Correlated with AMD long.
"""
# Step 1: Embed the trade note
response = openai.embeddings.create(
    model="text-embedding-3-small",
    input=trade_note,
)
embedding = response.data[0].embedding
# Step 2: Encrypt before storage
# Vector ID is a non-identifying reference, not the trader's name
vector = VectorData(vector=embedding, id="trade_eq_2026_0410_001")
encrypted_vector = sdk.vectax.encrypt(vector)
# Step 3: Set access policy scoped to the equities group
# Only the equity_analyst and portfolio_manager roles in the equities group can decrypt
equities_policy = {
    "roles": ["equity_analyst", "portfolio_manager"],
    "groups": ["equities"],
    "departments": ["trading"],
}
sdk.set_policy(equities_policy)
print("Trade note encrypted and scoped to equities group")
print("Investment banking group cannot decrypt this vector")
Each department's embeddings are encrypted under a separate policy. The vector database holds only ciphertext. A compliance officer with a key scoped to group=compliance can retrieve across groups for surveillance purposes, but only with the key that was generated for that role.
Step 2
Format-preserving encryption for account numbers
Customer account numbers must retain their format for downstream processing. A 10-digit account number needs to still look like a 10-digit account number after encryption, or the rest of your systems break. VectaX format-preserving encryption handles this.
from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig
config = MirrorConfig.from_env()
sdk = MirrorSDK(config)
# Customer record metadata - contains regulated identifiers
customer_record = {
    "source": "retail_brokerage",
    "account_number": "4920183756",
    "ssn_last4": "6842",
    "kyc_tier": "standard",
    "open_date": "2019-03-14",
}
# Generate FPE key and tweak, then encrypt
fpe_key = sdk.metadata.generate_key()
fpe_tweak = sdk.metadata.generate_tweak_from_data(customer_record)
encrypted_record = sdk.metadata.encrypt(customer_record, fpe_key, fpe_tweak)
print(f"Original account: {customer_record['account_number']}")
print(f"Encrypted account: {encrypted_record['account_number']}")
print("Format preserved: still 10 digits, usable by downstream systems")
# Store fpe_key in your key vault (Azure Key Vault, AWS KMS, etc.)
# Never store it alongside the data
The FTC Safeguards Rule under GLBA requires financial institutions to document their encryption key management procedures. The FPE key generated here must be stored in a managed key vault, not in application code or environment variables. Key rotation schedules must be documented and followed. A key management policy that exists only in someone's head is not GLBA-compliant.
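To see what "format-preserving" means mechanically, the following toy cipher maps a digit string to another digit string of the same length and back again. It is a balanced Feistel network with an HMAC round function, written for illustration only. It is not the algorithm VectaX uses, it has not been analysed for security, and production systems would normally rely on a standardised mode such as NIST FF1. The function names and the even-length restriction are choices made for this sketch.

```python
import hashlib
import hmac


def _round_value(key: bytes, tweak: bytes, round_no: int, half: str, modulus: int) -> int:
    """Keyed pseudo-random round function, reduced into the digit range."""
    message = tweak + bytes([round_no]) + half.encode()
    digest = hmac.new(key, message, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def _check(digits: str) -> int:
    if not (digits.isascii() and digits.isdigit()) or len(digits) % 2:
        raise ValueError("sketch handles even-length ASCII digit strings only")
    return len(digits) // 2


def toy_fpe_encrypt(key: bytes, tweak: bytes, digits: str, rounds: int = 10) -> str:
    half = _check(digits)
    modulus = 10 ** half
    left, right = digits[:half], digits[half:]
    for i in range(rounds):
        # Mix the left half with a keyed function of the right half, then swap
        mixed = (int(left) + _round_value(key, tweak, i, right, modulus)) % modulus
        left, right = right, f"{mixed:0{half}d}"
    return left + right


def toy_fpe_decrypt(key: bytes, tweak: bytes, digits: str, rounds: int = 10) -> str:
    half = _check(digits)
    modulus = 10 ** half
    left, right = digits[:half], digits[half:]
    for i in reversed(range(rounds)):
        # Undo one round: subtract the same keyed value, then swap back
        unmixed = (int(right) - _round_value(key, tweak, i, left, modulus)) % modulus
        left, right = f"{unmixed:0{half}d}", left
    return left + right
```

Encrypting `"4920183756"` with any key and tweak yields another 10-digit string, and decrypting with the same key and tweak returns the original. The tweak plays the same role as `fpe_tweak` above: it binds the ciphertext to a context, so the same account number encrypts differently under a different tweak.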
Step 3
Encrypted fraud pattern matching
Fraud detection via RAG works by embedding a new transaction and finding the nearest matches in a store of known fraud patterns. With VectaX, this comparison happens on encrypted vectors. The fraud pattern library is never decrypted during the matching process.
import openai
from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig
from mirror_sdk.core.models import VectorData
config = MirrorConfig.from_env()
sdk = MirrorSDK(config)
# Step 1: Store a known fraud pattern (done at pattern library build time)
fraud_pattern = """
Pattern: rapid small-value transactions under reporting threshold across
multiple accounts within 24 hours, followed by a single large consolidating
transfer. Accounts opened within 60 days. No prior transaction history.
"""
pattern_embedding = openai.embeddings.create(
    model="text-embedding-3-small",
    input=fraud_pattern,
).data[0].embedding
# Encrypt and store the fraud pattern
# Scoped to fraud_analyst and compliance roles only
encrypted_pattern = sdk.vectax.encrypt(
    VectorData(vector=pattern_embedding, id="fraud_pattern_smurfing_v2")
)
# Step 2: At detection time, embed and encrypt the incoming transaction
incoming_tx = """
Customer account opened 22 days ago. 14 deposits ranging $890-$970
across 6 branch locations over 48 hours. Single outbound wire $12,400.
"""
tx_embedding = openai.embeddings.create(
    model="text-embedding-3-small",
    input=incoming_tx,
).data[0].embedding
encrypted_tx = sdk.vectax.encrypt(
    VectorData(vector=tx_embedding, id="tx_check_live")
)
# Step 3: Similarity search runs on ciphertext
# The encrypted_tx is compared against encrypted fraud patterns
# The matching score is computed without decrypting either vector
# (pattern: encrypted_results = fraud_vector_db.query_encrypted(encrypted_tx, n_results=5))
print("Transaction embedded and encrypted for pattern matching")
print("Fraud pattern library stays encrypted throughout")
print("Similarity ranking identical to plaintext comparison")
The key property here is that Similarity-Preserving Search guarantees no accuracy loss. A transaction that scores 0.94 cosine similarity against a fraud pattern in plaintext scores 0.94 against the same pattern when both are encrypted. Fraud detection quality is not traded for security.
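It may help to see why a transformed vector space can give the same similarity scores at all. The sketch below uses a secret signed permutation, which is one simple orthogonal transform, and checks that cosine similarity is unchanged by it. This only illustrates the geometry. It makes no claim about how VectaX is implemented, and a signed permutation on its own is not a secure encryption scheme. All names are hypothetical.

```python
import math
import random


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def make_secret_transform(dim, seed):
    """A secret signed permutation: one simple kind of orthogonal matrix."""
    rng = random.Random(seed)  # not a cryptographic RNG; illustration only
    perm = list(range(dim))
    rng.shuffle(perm)
    signs = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
    return perm, signs


def apply_transform(vec, transform):
    """Reorder the coordinates and flip some signs."""
    perm, signs = transform
    return [sign * vec[src] for src, sign in zip(perm, signs)]


rng = random.Random(0)
pattern = [rng.gauss(0, 1) for _ in range(64)]
transaction = [p + 0.3 * rng.gauss(0, 1) for p in pattern]  # a near match
unrelated = [rng.gauss(0, 1) for _ in range(64)]

secret = make_secret_transform(64, seed=1234)
plain_score = cosine(pattern, transaction)
hidden_score = cosine(
    apply_transform(pattern, secret), apply_transform(transaction, secret)
)
```

Because the transform only reorders coordinates and flips signs, every dot product and norm is preserved, so `plain_score` and `hidden_score` agree up to floating-point rounding and the ranking of `transaction` above `unrelated` is the same in both spaces.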
Step 4
Validating AI financial advice with AgentIQ
Financial AI systems face a specific risk: hallucination. A model that confidently cites a stock price, a regulatory requirement, or a historical return that does not appear in the retrieved context is generating dangerous misinformation. In a regulated context, an AI-generated investment recommendation based on fabricated data is a compliance liability.
AgentIQ hallucination detection compares the model's response against the retrieved context and flags anything the model asserts that is not grounded in source material.
from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig
from mirror_sdk.core.mirror_api_models import Action
import logging
logger = logging.getLogger("financial_ai_guard")
config = MirrorConfig.from_env()
sdk = MirrorSDK(config)
def validate_financial_response(
    analyst_query: str,
    retrieved_context: str,
    model_response: str,
) -> dict:
    """
    Validate an AI-generated financial response.
    Checks for: hallucination and PII leakage.
    Returns a validation report before the response reaches the user.
    """
    # 1. Check for hallucination against the retrieved context
    hallucination_result = sdk.agentiq.analyze_hallucination(
        input=analyst_query,
        output=model_response,
        context=retrieved_context,
        threshold=0.75,  # stricter threshold for financial advice
    )
    is_hallucinated = False
    if hallucination_result.pairs:
        # is_hallucination may be a bool or a string, so normalise before comparing
        is_hallucinated = any(
            str(p.is_hallucination).lower() == "true"
            for p in hallucination_result.pairs
        )
    # 2. Scan for PII leakage in the response
    # Customer data from retrieved context must not leak to other users
    pii_result = sdk.agentiq.detect_pii(
        text=model_response,
        pii_entities=["NAME", "ACCOUNT_NUMBER", "SSN", "EMAIL", "PHONE"],
        action=Action.REDACT,
    )
    pii_detected = len(pii_result.entities) > 0
    # 3. Log and return validation outcome
    if is_hallucinated:
        logger.warning("Hallucination detected in financial AI response. Blocking.")
    if pii_detected:
        logger.warning(
            "PII detected in response: %s",
            [e.label for e in pii_result.entities],
        )
    return {
        "approved": not is_hallucinated and not pii_detected,
        "hallucination_detected": is_hallucinated,
        "pii_detected": pii_detected,
        "safe_response": pii_result.redacted_text,
        "risk_score": pii_result.risk_score,
    }
The default hallucination threshold in AgentIQ is 0.5. For financial advice, use 0.75 or higher. A financial AI that says "the fund returned 12.3% in 2024" when the retrieved context says "the fund returned 11.8% in 2024" is not a minor error. In a regulated context it can be material misrepresentation. Set the threshold conservatively.
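For numeric claims specifically, a cheap deterministic check can run alongside the model-based detector. The sketch below is a naive complement, not a replacement for AgentIQ and not part of the Mirror SDK: it extracts figures from the response and reports any that never appear in the retrieved context. The function name and the regular expression are assumptions made for this example, and the match is purely textual, so "11.8%" and "11.80%" would count as different figures.

```python
import re

# Integers, thousands-separated numbers, decimals, and optional percent sign
_FIGURE = re.compile(r"\d+(?:,\d{3})*(?:\.\d+)?%?")


def ungrounded_figures(response: str, context: str) -> list[str]:
    """Return numeric figures in the response that never appear in the context."""
    grounded = set(_FIGURE.findall(context))
    return [fig for fig in _FIGURE.findall(response) if fig not in grounded]


context = "The fund returned 11.8% in 2024."
response = "The fund returned 12.3% in 2024."
flagged = ungrounded_figures(response, context)
```

For the example in the paragraph above, `flagged` contains only `"12.3%"`: the year is grounded in the context, the return figure is not. A non-empty result is a reasonable trigger for blocking the response or routing it to review regardless of the hallucination score.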
Step 5
Full trading desk RAG pipeline
This assembles the complete pipeline: encrypted retrieval scoped to the requesting analyst's department, decryption at context assembly, validated AI response.
import logging
import openai
from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig
from mirror_sdk.core.models import VectorData
from mirror_sdk.core.mirror_api_models import Action
logger = logging.getLogger("financial_ai_guard")
config = MirrorConfig.from_env()
sdk = MirrorSDK(config)
# validate_financial_response is the function defined in Step 4
def trading_rag_query(
    analyst_query: str,
    analyst_role: str,
    analyst_group: str,
    analyst_department: str,
    vector_store,
    n_results: int = 4,
) -> dict:
    """
    Run a trading desk RAG query with encrypted retrieval.
    The analyst's group scope determines which records are decryptable.
    Responses are validated for hallucination and PII before return.
    """
    # 1. Generate a role-scoped key for this analyst
    # NOTE: analyst_key is not passed explicitly to decrypt() below. This assumes
    # the SDK applies the generated key to later calls; confirm in the SDK reference.
    analyst_key = sdk.rbac.generate_user_secret_key({
        "roles": [analyst_role],
        "groups": [analyst_group],
        "departments": [analyst_department],
    })
    # 2. Embed and encrypt the query
    query_emb = openai.embeddings.create(
        model="text-embedding-3-small",
        input=analyst_query,
    ).data[0].embedding
    encrypted_q = sdk.vectax.encrypt(
        VectorData(vector=query_emb, id="query")
    )
    # 3. Search encrypted vector store
    # Records outside the analyst's key scope cannot be decrypted
    encrypted_results = vector_store.query_encrypted(
        encrypted_q, n_results=n_results
    )
    # 4. Decrypt results using the scoped key
    contexts = []
    for enc_result in encrypted_results:
        try:
            decrypted = sdk.vectax.decrypt(enc_result)
            contexts.append(decrypted.metadata.get("summary", ""))
        except Exception:
            # Record from a different group: cannot decrypt, so skip it.
            # Logged at debug level so unrelated failures are not hidden entirely.
            logger.debug("Skipped a record that could not be decrypted", exc_info=True)
    context_block = "\n\n".join(contexts)
    # 5. Generate AI response
    prompt = f"""You are a trading desk research assistant.
Answer using only the trade notes and research provided. Do not add
information not present in the notes. If the notes do not address the
question, say so clearly.
Notes:
{context_block}
Query: {analyst_query}
Response:"""
    response = openai.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.1,
    )
    model_response = response.choices[0].message.content
    # 6. Validate before returning
    validation = validate_financial_response(
        analyst_query, context_block, model_response
    )
    if not validation["approved"]:
        return {
            "response": "Response flagged for review. Please contact your compliance team.",
            "approved": False,
            "reason": validation,
        }
    return {
        "response": validation["safe_response"],
        "approved": True,
    }
Regulatory mapping
Compliance coverage for financial AI
Financial AI systems touch several overlapping regulatory regimes. This table maps the controls in this pipeline to the specific requirements.
| Regulation | Requirement | VectaX covers | Gaps to address |
|---|---|---|---|
| GLBA / FTC Safeguards Rule | Encrypt customer financial data; documented key management | Yes: FPE + vector encryption | Write and document key rotation policy |
| SEC Regulation S-P | Protect customer records and information | Yes: RBAC + encrypted storage | Incident response plan for AI breaches |
| MiFID II | Record-keeping of trading communications and orders | Partial: encrypted storage; audit logging is separate | Immutable audit log for all AI-assisted queries |
| DORA (EU) | ICT risk management including AI; resilience testing | Partial: encryption controls | Red team testing covered in Module G2 |
| GDPR | Data minimisation; encryption of EU personal data; DPIAs | Yes: FPE for PII; RBAC for access minimisation | DPIA documentation; consent management for AI decisions |
| Securities Exchange Act (insider trading) | Information barriers between departments with MNPI | Yes: cryptographic Chinese walls via RBAC scoping | Written Chinese wall policy; employee training records |
This module covers the data protection side of financial AI compliance. Red teaming the AI system for adversarial attacks, jailbreaks, and prompt injection is in Module G2. A compliant financial AI system needs both: the controls to protect data, and the evidence from red teaming that the controls hold under adversarial conditions.
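The MiFID II row lists an immutable audit log for AI-assisted queries as a gap. One common building block is a hash-chained log, sketched below with the standard library. This is a hypothetical illustration, not a Mirror SDK feature. It makes tampering detectable but does not by itself make the log immutable, which still requires write-once storage. Whether the full query text must also be retained is a question for your compliance team; this sketch stores only a digest so that the log itself holds no MNPI.

```python
import hashlib
import json
from datetime import datetime, timezone


class AuditLog:
    """Append-only log where each entry commits to the one before it."""

    GENESIS = "0" * 64

    def __init__(self):
        self.entries = []

    @staticmethod
    def _digest(body: dict) -> str:
        canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode()).hexdigest()

    def append(self, analyst_id: str, query: str, approved: bool) -> dict:
        body = {
            "ts": datetime.now(timezone.utc).isoformat(),
            "analyst_id": analyst_id,
            # Store a digest of the query rather than its text
            "query_sha256": hashlib.sha256(query.encode()).hexdigest(),
            "approved": approved,
            "prev_hash": self.entries[-1]["hash"] if self.entries else self.GENESIS,
        }
        entry = dict(body, hash=self._digest(body))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check each link to the previous entry."""
        prev = self.GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev or entry["hash"] != self._digest(body):
                return False
            prev = entry["hash"]
        return True
```

Calling `append` after each `trading_rag_query` call and running `verify` on a schedule gives early warning if any historical entry has been edited or removed from the middle of the chain.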
Next: Red team this system end-to-end with DiscoveR
Module G2 runs DiscoveR against the financial AI system you built here. You will see what a jailbreak attempt looks like against a trading desk assistant, which attack categories apply to financial AI, and how to interpret and act on the scan results.