The regulatory baseline
What HIPAA actually requires
The technical safeguards of the HIPAA Security Rule (45 CFR 164.312) include four standards that apply directly to an AI system processing PHI. Knowing exactly which CFR sections apply is not just academic: OCR investigators cite specific sections in investigation letters, and your documentation needs to address them by name.
Access control, 45 CFR 164.312(a)(1): unique user identification, automatic logoff, and encryption and decryption. Every user of the AI system must be identifiable in logs. Access must be role-scoped and revocable.
Audit controls, 45 CFR 164.312(b): hardware, software, and procedural mechanisms that record and examine access to PHI. For AI, that means every query, every retrieval, and every model response that included PHI in context.
Integrity, 45 CFR 164.312(c)(1): PHI must not be improperly altered or destroyed. For AI systems this includes protecting model outputs from alteration and protecting audit logs from tampering.
Transmission security, 45 CFR 164.312(e)(1): encryption of PHI in transit. Treat TLS 1.2 as the minimum. This applies to every network hop in your AI pipeline, including internal calls between services.
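The TLS floor for outbound calls can be enforced in code rather than left to library defaults. The sketch below uses only the Python standard library; the helper name `make_tls_context` is an illustrative assumption, and your HTTP client may expose its own way to accept an `ssl.SSLContext`.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client-side context that refuses anything older than TLS 1.2."""
    # create_default_context() turns on certificate and hostname verification
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


ctx = make_tls_context()
```

A context like this can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=ctx)`, so that internal service-to-service calls fail closed if a peer only offers an older protocol version.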
The Breach Notification Rule adds a separate set of obligations. If you cannot demonstrate, through logs, that a suspected breach did not expose PHI, HIPAA presumes a breach occurred. A complete audit trail is your primary defence against presumed breach liability.
Under 45 CFR 164.402, an impermissible use or disclosure of PHI is presumed to be a breach unless you can demonstrate a low probability that the PHI was compromised, and 45 CFR 164.414(b) puts the burden of proof on you. Gaps in your audit log are not neutral: they remove the evidence you would need to rebut the presumption. An AI system with no query-level logging creates presumed breach liability for every incident, regardless of whether PHI was actually accessed.
Design
What a HIPAA-compliant AI audit log must contain
HIPAA does not specify a log format, but OCR has been consistent in what it looks for. The fields below are what you need in order to support breach analysis and to demonstrate minimum necessary access.
| Field | Why it is required | Example value |
|---|---|---|
| event_id | Unique identifier for each log entry; required for integrity verification | evt_8b2f4a1c9d |
| timestamp_utc | Exact time in UTC; supports incident timeline reconstruction | 2026-04-10T14:22:08.441Z |
| user_id | Unique user identifier (not name; name is PHI) | usr_Dr_A_449 |
| user_role | Role at time of access; supports minimum necessary analysis | physician |
| department | Department at time of access; supports minimum necessary analysis | cardiology |
| action | What the user did: query, retrieve, view, export | rag_query |
| record_ids_accessed | Which patient records were retrieved; required for breach scope analysis | ["rec_7f2a", "rec_9c1b"] |
| query_hash | Hash of the query text (not the text itself); proves the query existed without storing PHI | sha256:3d4e... |
| response_phi_detected | Whether AgentIQ detected PHI in the model's response; required for output monitoring | false |
| session_id | Links related events; supports session-level incident analysis | ses_f9e2c7 |
| log_hash | Hash of all other fields; tamper detection | sha256:a1b2... |
Storing the actual query text or response text in an audit log means your audit log is itself a PHI store, subject to all HIPAA requirements. Log the hash of the query, not the query. Log whether PHI was detected in the response, not the response. The record IDs accessed are usually safe if they are non-identifying reference IDs, not MRNs or names.
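One caveat about hashing: a plain SHA-256 of a short query can be guessed by hashing candidate queries and comparing the results. A possible mitigation, sketched here as an assumption and not something the log format requires, is a keyed hash (HMAC) whose key lives in your key vault. The function name, the `hmac-sha256:` prefix, and the demo key are all hypothetical.

```python
import hashlib
import hmac


def keyed_query_hash(query_text: str, key: bytes) -> str:
    """Return an HMAC-SHA256 of the query text, so the log never stores the text."""
    digest = hmac.new(key, query_text.encode("utf-8"), hashlib.sha256).hexdigest()
    return "hmac-sha256:" + digest


# Demo key only; in a real deployment the key would come from a managed vault
demo_key = b"demo-key-load-the-real-one-from-your-vault"
h1 = keyed_query_hash("latest labs for the patient in bed 4", demo_key)
h2 = keyed_query_hash("latest labs for the patient in bed 4", demo_key)
h3 = keyed_query_hash("latest labs for the patient in bed 4", b"a-different-key")
```

The same query and key always produce the same value, so you can still prove that a given query was issued, while someone who holds only the log cannot test guesses without the key.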
Implementation
Building the audit log
This implementation captures each PHI access event and writes a tamper-evident log entry. It uses AgentIQ to check whether the model's response contained PHI before writing the outcome.
```python
import hashlib
import json
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig
from mirror_sdk.core.mirror_api_models import Action

config = MirrorConfig.from_env()
sdk = MirrorSDK(config)


@dataclass
class AuditEvent:
    user_id: str
    user_role: str
    department: str
    action: str
    record_ids_accessed: List[str]
    query_hash: str
    session_id: str
    response_phi_detected: bool = False
    event_id: str = field(default_factory=lambda: f"evt_{uuid.uuid4().hex[:10]}")
    timestamp_utc: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    log_hash: str = ""

    def compute_hash(self) -> str:
        """Compute a tamper-detection hash of this log entry."""
        # Every field except log_hash itself goes into the hash
        payload = {
            "event_id": self.event_id,
            "timestamp_utc": self.timestamp_utc,
            "user_id": self.user_id,
            "user_role": self.user_role,
            "department": self.department,
            "action": self.action,
            "record_ids_accessed": self.record_ids_accessed,
            "query_hash": self.query_hash,
            "session_id": self.session_id,
            "response_phi_detected": self.response_phi_detected,
        }
        # sort_keys gives a stable serialisation, so the hash is reproducible
        canonical = json.dumps(payload, sort_keys=True)
        return hashlib.sha256(canonical.encode()).hexdigest()


def log_phi_access(
    sdk: MirrorSDK,
    user_id: str,
    user_role: str,
    department: str,
    query_text: str,
    record_ids: List[str],
    model_response: str,
    session_id: str,
) -> AuditEvent:
    """
    Log a PHI access event with tamper detection.
    Checks the model response for PHI leakage before writing.
    Does not store the query text or response text in the log.
    """
    # Hash the query (do not store the query itself)
    query_hash = "sha256:" + hashlib.sha256(query_text.encode()).hexdigest()

    # Check model response for PHI leakage using AgentIQ
    phi_result = sdk.agentiq.detect_pii(
        text=model_response,
        pii_entities=["NAME", "EMAIL", "PHONE", "SSN", "MRN", "DATE"],
        action=Action.ALERT,
    )
    phi_detected = len(phi_result.entities) > 0

    # Build the audit event
    event = AuditEvent(
        user_id=user_id,
        user_role=user_role,
        department=department,
        action="rag_query",
        record_ids_accessed=record_ids,
        query_hash=query_hash,
        session_id=session_id,
        response_phi_detected=phi_detected,
    )

    # Compute tamper-detection hash and write it into the event
    event.log_hash = event.compute_hash()

    # Write to your audit log store.
    # This should be an append-only, write-once store (e.g. Azure Immutable Blob Storage).
    # write_audit_log and trigger_phi_leakage_alert are not defined in this listing;
    # supply implementations that match your storage and alerting stack.
    write_audit_log(event)
    if phi_detected:
        trigger_phi_leakage_alert(event, phi_result.entities)
    return event
```
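The listing above calls `write_audit_log` without defining it. For local development, one hypothetical stand-in is a JSON Lines file opened in append mode, as sketched below. The path, the function names, and the read helper are assumptions made for illustration; a local file is not a substitute for the immutable store that production requires.

```python
import json
import os
import tempfile
from dataclasses import asdict, is_dataclass

# Hypothetical development path; production should target immutable storage
AUDIT_LOG_PATH = os.path.join(tempfile.gettempdir(), "phi_audit_dev.jsonl")


def write_audit_log(event, path: str = AUDIT_LOG_PATH) -> None:
    """Append one audit event as a single JSON line (development stand-in)."""
    if is_dataclass(event) and not isinstance(event, type):
        record = asdict(event)  # dataclass instance, e.g. an AuditEvent
    else:
        record = dict(event)    # already a mapping
    line = json.dumps(record, sort_keys=True)
    # Mode "a" only ever adds to the end of the file; fsync pushes it to disk
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
        fh.flush()
        os.fsync(fh.fileno())


def read_audit_log(path: str = AUDIT_LOG_PATH) -> list:
    """Load all entries back, in the order they were written."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

One line per event keeps each entry independently parseable, which makes it straightforward to feed the file into a hash verification job later.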
Tamper protection
Log integrity protection
An audit log that can be modified is not a HIPAA audit log. Under 45 CFR 164.312(c)(1), PHI must be protected from improper alteration. This applies to the audit log itself. If an attacker can delete or modify log entries after the fact, the log cannot prove minimum necessary access.
Three mechanisms protect log integrity in production:
Append-only storage
Use a storage service with immutability features. Azure Immutable Blob Storage and AWS S3 Object Lock both support time-based retention policies where objects cannot be deleted or overwritten for a defined period. Set this to at least six years to meet HIPAA retention requirements.
Entry-level hashing
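Each log entry includes a hash of its own fields, as shown in the code above. Any modification to an entry that does not also update the hash is detectable: a periodic verification job recomputes the hashes and compares them. Because anyone with write access could also recompute a plain hash, combine this with append-only storage or sign entries with the audit log signing key.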
Log chain hashing
For stronger integrity, chain log entries: include the hash of the previous entry in the hash of the current entry. Modifying any entry then invalidates every subsequent entry, so silent tampering is detectable as long as the most recent hash is stored or signed somewhere the attacker cannot reach.
```python
import hashlib
import json


def verify_audit_log_integrity(log_entries: list) -> dict:
    """
    Verify that no audit log entries have been tampered with.
    Returns a report of any integrity failures.
    """
    failures = []
    for entry in log_entries:
        stored_hash = entry.get("log_hash", "")
        # Recompute the hash from the entry fields
        check_fields = {k: v for k, v in entry.items() if k != "log_hash"}
        canonical = json.dumps(check_fields, sort_keys=True)
        expected_hash = hashlib.sha256(canonical.encode()).hexdigest()
        if stored_hash != expected_hash:
            failures.append({
                "event_id": entry.get("event_id"),
                "timestamp_utc": entry.get("timestamp_utc"),
                "issue": "hash_mismatch"
            })
    return {
        "total_entries": len(log_entries),
        "failures": failures,
        "integrity_verified": len(failures) == 0
    }
```
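The log chain hashing described earlier in this section can be sketched in a few lines. This is a minimal illustration under assumptions: the field names `prev_hash` and `chain_hash`, the all-zero genesis value, and the helper names are hypothetical and not part of any SDK.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # hypothetical anchor used before the first entry


def chain_hash(entry: dict, prev_hash: str) -> str:
    """Hash the entry's own fields together with the previous entry's hash."""
    body = {k: v for k, v in entry.items() if k not in ("prev_hash", "chain_hash")}
    canonical = json.dumps(body, sort_keys=True)
    return hashlib.sha256((prev_hash + canonical).encode()).hexdigest()


def append_chained(log: list, entry: dict) -> dict:
    """Append an entry that is linked to the current tail of the log."""
    prev = log[-1]["chain_hash"] if log else GENESIS_HASH
    chained = dict(entry, prev_hash=prev)
    chained["chain_hash"] = chain_hash(entry, prev)
    log.append(chained)
    return chained


def first_broken_index(log: list):
    """Return the index of the first entry that fails verification, or None."""
    prev = GENESIS_HASH
    for i, entry in enumerate(log):
        if entry.get("prev_hash") != prev or entry.get("chain_hash") != chain_hash(entry, prev):
            return i
        prev = entry["chain_hash"]
    return None
```

Editing or deleting an entry breaks the link at that position. An attacker who can rewrite the whole tail of the log could still rebuild a consistent chain, so the latest `chain_hash` should be recorded or signed outside the log store.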
Key management
Key lifecycle management for VectaX deployments
Every VectaX deployment has at least two types of keys: the FPE keys used for format-preserving encryption of structured identifiers, and the RBAC keys scoped to individual users. Both require a documented lifecycle.
HIPAA does not prescribe a key rotation interval. The addressable specification at 45 CFR 164.312(a)(2)(iv) calls for a mechanism to encrypt and decrypt ePHI, and in a review it is your documented key management policy that gets examined, not the specific interval. A 90-day rotation for active keys is a common choice in healthcare deployments.
| Key type | Rotation interval | On compromise | Retention after rotation |
|---|---|---|---|
| FPE metadata key | 90 days or as per policy | Immediate rotation | 6 years (to decrypt archived data) |
| RBAC user key | On role change or departure | Immediate revocation | 30 days (pending session close) |
| RBAC department key | Annual or on policy change | Immediate rotation | 6 years (to decrypt archived data) |
| Audit log signing key | Annual | Immediate rotation | 6 years (to verify historical logs) |
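The time-based rows of the table can be turned into a simple scheduled check. The sketch below is an assumption about how you might encode the policy: the key-type labels and the function name are hypothetical, "annual" is approximated as 365 days, and the event-driven RBAC user key is left out because it rotates on role change, not on a timer.

```python
from datetime import datetime, timedelta, timezone

# Intervals taken from the table above; the dictionary keys are illustrative labels
ROTATION_INTERVALS = {
    "fpe_metadata": timedelta(days=90),
    "rbac_department": timedelta(days=365),
    "audit_log_signing": timedelta(days=365),
}


def rotation_due(key_type: str, created_at: datetime, now: datetime = None) -> bool:
    """Return True once the key's age has reached its rotation interval."""
    now = now or datetime.now(timezone.utc)
    return now - created_at >= ROTATION_INTERVALS[key_type]


created = datetime(2026, 1, 1, tzinfo=timezone.utc)
due_at_90_days = rotation_due("fpe_metadata", created, now=datetime(2026, 4, 1, tzinfo=timezone.utc))
due_at_89_days = rotation_due("fpe_metadata", created, now=datetime(2026, 3, 31, tzinfo=timezone.utc))
```

Keeping the intervals in one table-like structure means the code and the written policy can be reviewed side by side.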
Never store VectaX keys in environment variables in production. Use a managed key vault: Azure Key Vault, AWS KMS, or HashiCorp Vault. The vault provides access logging, rotation automation, and key escrow. If your organisation is audited, the key vault's access logs are primary evidence of key management compliance.
Implementation
Key rotation implementation
This shows the pattern for rotating an FPE key and re-encrypting metadata records. In production this runs as a scheduled job at your defined rotation interval.
```python
import logging

from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig

logger = logging.getLogger("hipaa_key_rotation")
config = MirrorConfig.from_env()
sdk = MirrorSDK(config)


def rotate_fpe_key(
    metadata_records: list,
    old_key: str,
    key_vault_client
) -> dict:
    """
    Rotate the FPE key for patient metadata.
    1. Generate a new key in the key vault
    2. Re-encrypt all metadata records with the new key
    3. Archive the old key (retain for 6 years for historical decryption)
    4. Log the rotation event for compliance
    """
    # Step 1: Generate new key
    new_key = sdk.metadata.generate_key()
    new_key_id = key_vault_client.store_key(
        key=new_key,
        label="vectax-fpe-active"
    )
    logger.info(f"New FPE key generated: {new_key_id}")

    # Step 2: Re-encrypt all metadata records
    reencrypted = []
    failed = []
    for record in metadata_records:
        try:
            # Decrypt with old key
            tweak = sdk.metadata.generate_tweak_from_data(record["encrypted_metadata"])
            plaintext_metadata = sdk.metadata.decrypt(
                record["encrypted_metadata"], old_key, tweak
            )
            # Re-encrypt with new key
            new_tweak = sdk.metadata.generate_tweak_from_data(plaintext_metadata)
            new_encrypted = sdk.metadata.encrypt(plaintext_metadata, new_key, new_tweak)
            reencrypted.append({
                "record_id": record["record_id"],
                "encrypted_metadata": new_encrypted,
                "key_id": new_key_id
            })
        except Exception as e:
            # Keep going; failed records stay readable with the archived old key
            logger.error(f"Failed to re-encrypt record {record['record_id']}: {e}")
            failed.append(record["record_id"])

    # Step 3: Archive (do not delete) the old key
    key_vault_client.archive_key(old_key, label="vectax-fpe-retired")

    # Step 4: Log the rotation for compliance evidence
    logger.info(
        f"FPE key rotation complete. "
        f"Records re-encrypted: {len(reencrypted)}. "
        f"Failures: {len(failed)}. "
        f"New key ID: {new_key_id}"
    )
    return {
        "new_key_id": new_key_id,
        "reencrypted_count": len(reencrypted),
        # Returned so the caller can persist the re-encrypted records
        "reencrypted_records": reencrypted,
        "failed_records": failed
    }
```
Output monitoring
PHI leakage in AI outputs
This is the part most healthcare AI implementations miss. A clinical AI system can generate a response that includes a patient name, date of service, or diagnosis without anyone intending it to. This happens when clinical context is injected into the LLM prompt and the model includes specifics in its response. Any such output that reaches an unauthorised viewer is a potential breach.
AgentIQ's PII detection can scan model responses before they are returned to the user. If PHI is detected, the response can be blocked or the sensitive text redacted before display.
```python
import logging

from mirror_sdk.core.mirror_core import MirrorSDK, MirrorConfig
from mirror_sdk.core.mirror_api_models import Action

logger = logging.getLogger("phi_output_guard")
config = MirrorConfig.from_env()
sdk = MirrorSDK(config)

# PHI entity types relevant to clinical AI outputs
CLINICAL_PHI_ENTITIES = [
    "NAME",            # Patient name
    "DATE",            # Dates of service, birth dates
    "PHONE",           # Phone numbers
    "EMAIL",           # Email addresses
    "SSN",             # Social security numbers
    "MRN",             # Medical record numbers
    "ADDRESS",         # Geographic identifiers
    "MEDICAL_RECORD",  # Diagnosis codes, procedure codes
]


def guard_clinical_output(
    model_response: str,
    action: Action = Action.REDACT
) -> dict:
    """
    Scan a clinical AI model response for PHI.
    REDACT by default: returns the response with PHI replaced.
    ALERT: returns the response unchanged but flags for review.
    BLOCK: raises an exception if PHI is detected.
    """
    result = sdk.agentiq.detect_pii(
        text=model_response,
        pii_entities=CLINICAL_PHI_ENTITIES,
        action=action
    )
    if result.entities:
        logger.warning(
            f"PHI detected in clinical AI output. "
            f"Entities: {[e.label for e in result.entities]}. "
            f"Action: {action}. "
            f"Risk score: {result.risk_score}"
        )
    return {
        "original_length": len(model_response),
        "phi_detected": len(result.entities) > 0,
        "entity_count": len(result.entities),
        "entity_types": [e.label for e in result.entities],
        "risk_score": result.risk_score,
        # With REDACT action, this is the safe version to return to the user
        "safe_response": result.redacted_text if action == Action.REDACT else model_response
    }
```
If you implement nothing else from this module, implement output PHI detection. It is the control that directly maps to the HIPAA minimum necessary rule for AI outputs and the one that OCR has been asking about in recent audit questionnaires for healthcare AI systems.
Incident response
Breach detection workflow
HIPAA breach notification timelines are strict. You must notify affected individuals without unreasonable delay and no later than 60 calendar days after discovery. Discovery is not the moment you confirm a breach; it is the first day you know of it, or would have known of it with reasonable diligence. A good audit log and automated alerting are what allow you to detect a potential breach quickly and start the clock accurately.
Your VectaX deployment limits breach scope in two ways. First, encrypted data accessed without the right key is unreadable, so a storage breach does not automatically become a PHI breach. Second, RBAC scoping means a compromised user credential can only access records in that user's key scope, not the entire database.
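Two steps of a breach workflow lend themselves to automation: computing the outer notification deadline and using the audit log to bound which records a compromised credential touched. The sketch below assumes log entries shaped like the audit table earlier in this module, with timestamps in the table's `Z`-suffixed format; the function names and sample data are hypothetical.

```python
from datetime import date, datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(days=60)  # outer limit; notify sooner where possible


def notification_deadline(discovered_on: date) -> date:
    """Latest date for notifying individuals, counted from the discovery date."""
    return discovered_on + NOTIFICATION_WINDOW


def records_in_scope(log_entries: list, user_id: str,
                     start_utc: datetime, end_utc: datetime) -> set:
    """Collect the record IDs one user accessed inside a time window."""
    scope = set()
    for entry in log_entries:
        # Convert the trailing "Z" so fromisoformat accepts it on older Pythons
        ts = datetime.fromisoformat(entry["timestamp_utc"].replace("Z", "+00:00"))
        if entry["user_id"] == user_id and start_utc <= ts <= end_utc:
            scope.update(entry["record_ids_accessed"])
    return scope


sample_log = [
    {"user_id": "usr_449", "timestamp_utc": "2026-04-10T14:22:08.441Z",
     "record_ids_accessed": ["rec_7f2a", "rec_9c1b"]},
    {"user_id": "usr_449", "timestamp_utc": "2026-04-12T09:00:00.000Z",
     "record_ids_accessed": ["rec_9c1b", "rec_1d3e"]},
    {"user_id": "usr_501", "timestamp_utc": "2026-04-10T15:00:00.000Z",
     "record_ids_accessed": ["rec_aaaa"]},
    {"user_id": "usr_449", "timestamp_utc": "2026-03-01T08:00:00.000Z",
     "record_ids_accessed": ["rec_old1"]},
]
window_start = datetime(2026, 4, 9, tzinfo=timezone.utc)
window_end = datetime(2026, 4, 13, tzinfo=timezone.utc)
scope = records_in_scope(sample_log, "usr_449", window_start, window_end)
deadline = notification_deadline(date(2026, 4, 10))
```

If the log has gaps inside the window, the scope this returns is a lower bound and not a complete answer, which is why query-level logging matters for breach analysis.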
Compliance documentation
Generating evidence for OCR
When OCR opens an investigation, they send a data request letter specifying what documentation is required. Healthcare organisations that have automated evidence generation respond faster and with more complete documentation. The table below lists what they typically ask for.
| OCR request item | Your source | VectaX or AgentIQ coverage |
|---|---|---|
| Risk analysis document | Your security policy documentation | Manual: must be written |
| Access control policies | Your RBAC configuration and key scope definitions | Exportable from VectaX RBAC config |
| Evidence of encryption | VectaX deployment documentation | Documented: FHE + FPE + RBAC |
| Audit logs for the relevant period | Your audit log store | Automated: with integrity hashes |
| Evidence of PHI output monitoring | AgentIQ scan results | Automated: via detect_pii logs |
| Key management policies | Your key vault audit logs + written policy | Partial: vault logs + manual policy |
| Business associate agreements | Signed BAA with cloud providers | Manual: legal requirement |
| Training records | Your LMS or training platform | Manual: HR/training requirement |
| Incident response records | Your incident tickets and timeline | Partial: audit logs provide timeline evidence |
Most healthcare AI teams have good technical controls and poor documentation. OCR does not assess your security by looking at your code. They assess it by reading your policies, reviewing your audit logs, and asking your Security Officer to describe your procedures. The controls mean nothing without the documentation that proves they exist and are followed. Write the policies before you go to production, not after an incident.
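For the audit log row of the table, automated evidence generation can start as a small summary job. The sketch below is one possible shape under assumptions: entries follow the audit field table from earlier in this module, all timestamps share one UTC ISO 8601 format, and the function and output field names are hypothetical.

```python
import hashlib
import json
from collections import Counter


def build_evidence_summary(log_entries: list, period_start: str, period_end: str) -> dict:
    """Summarise audit log activity for a reporting period."""
    # UTC ISO 8601 strings in a single fixed format sort correctly as plain text
    in_period = [
        e for e in log_entries
        if period_start <= e["timestamp_utc"] <= period_end
    ]
    summary = {
        "period_start": period_start,
        "period_end": period_end,
        "total_events": len(in_period),
        "events_by_role": dict(Counter(e["user_role"] for e in in_period)),
        "phi_output_detections": sum(1 for e in in_period if e["response_phi_detected"]),
    }
    # Digest of the exported entries, so the bundle can be re-verified later
    canonical = json.dumps(in_period, sort_keys=True)
    summary["export_sha256"] = hashlib.sha256(canonical.encode()).hexdigest()
    return summary


sample_entries = [
    {"timestamp_utc": "2026-04-10T14:22:08.441Z", "user_role": "physician",
     "response_phi_detected": False},
    {"timestamp_utc": "2026-04-11T09:05:00.000Z", "user_role": "nurse",
     "response_phi_detected": True},
    {"timestamp_utc": "2026-05-02T10:00:00.000Z", "user_role": "physician",
     "response_phi_detected": False},
]
report = build_evidence_summary(
    sample_entries, "2026-04-01T00:00:00.000Z", "2026-04-30T23:59:59.999Z"
)
```

A summary like this does not replace the raw log; it gives the Security Officer a quick, reproducible overview to hand over alongside the exported entries.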
Healthcare track complete
You have built an encrypted clinical AI pipeline and its compliance infrastructure. VectaX handles the encryption. AgentIQ handles output monitoring. The audit trail handles the evidence. Contact Mirror Security to discuss production deployment for your organisation.