DiscoveR – Automated AI Red Teaming

Overview

What is DiscoveR?

DiscoveR is Mirror Security's automated red teaming product. It finds exploitable weaknesses in AI applications before adversaries do.

Manual red teaming does not scale. A security team testing one LLM application might run a few hundred probes in a day. DiscoveR runs thousands of targeted adversarial attacks across eight attack categories, classifies each probe as passed or failed, and returns a structured vulnerability report with severity ratings you can act on immediately.

The fix-and-verify loop is built into the product. After patching your application, rerun the exact same attack prompts to confirm the fix is effective. Track improvement over time using scan correlation chains. Integrate the whole process into your CI/CD pipeline so security testing becomes part of every deployment.

Architecture

The Automated Red Team Loop

DiscoveR operates as a continuous improvement loop. Register your application once. Configure categories and budget. Run the scan. Review findings. Fix your application. Immediately rerun to verify. Every iteration is tracked with a shared correlation ID so the complete history of a security test sequence is queryable as a single chain.

DiscoveR Scan Loop

01 Register App: Endpoint, type, domain
02 Configure: Categories, budget
03 Attack: 1000s of adversarial probes
04 Results: Vulnerability report + severity
05 Fix: Add guardrails, update prompts
06 Rerun: Verify fixes, track regression

↻ Rerun feeds back into Step 03: same attack prompts, same correlation chain, traceable history

Supported Application Types

DiscoveR can scan any AI application regardless of how it is built or hosted. The application type determines how DiscoveR interacts with the system.

Type	Setting	Description
REST API	type: "api"	Standard HTTP chatbots and LLM endpoints. POST requests, JSON response. Instant validation.
Streaming SSE	transport: "sse"	OpenAI-compatible streaming endpoints. Establishes SSE connection, reads chunked responses.
WebSocket	transport: "websocket"	Real-time chat and voice agent backends. Bidirectional streaming, persistent connection.
Browser App	type: "web-application"	GUI-based chat UIs. Headless browser agent navigates, logs in, and reads the chat UI directly.

Type	How DiscoveR Connects	Validation Time
REST API	POST requests to endpoint; reads JSON response	Seconds
Streaming SSE	Establishes SSE connection; reads first chunk to validate	Seconds
WebSocket	Opens WebSocket; validates connection handshake	Seconds
Browser App	Headless browser launches, attempts login, fingerprints UI	1-2 minutes
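As an illustration of how the type and transport settings above might be checked before registration, here is a minimal sketch. The `AppRegistration` class and its field names are hypothetical and are not the SDK's request class; only the string values come from this page.

```python
from dataclasses import dataclass
from typing import Optional

# String values taken from the application types listed above.
APP_TYPES = {"api", "web-application"}
TRANSPORTS = {"sse", "websocket"}


@dataclass
class AppRegistration:
    """Hypothetical registration payload; not the SDK's own request type."""
    endpoint: str
    type: str
    domain: str
    transport: Optional[str] = None  # assumed to be set only for streaming endpoints

    def validate(self) -> None:
        """Reject values that do not appear in the documented settings."""
        if self.type not in APP_TYPES:
            raise ValueError(f"unknown application type: {self.type!r}")
        if self.transport is not None and self.transport not in TRANSPORTS:
            raise ValueError(f"unknown transport: {self.transport!r}")
```

Catching a mistyped setting locally is cheaper than waiting for remote validation, which can take 1-2 minutes for browser apps.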

Attack Coverage

Attack Categories

DiscoveR ships eight curated categories, each targeting a distinct class of AI vulnerability. It distributes the prompt budget across the selected categories by weighted priority: high-impact attacks receive approximately 70% of the budget, and the remaining attacks share roughly 30%.

Category	What It Tests	Time	Best For
quickScan	Core injection, basic jailbreaks, essential security checks	5-10 min	CI/CD, daily validation
jailbreakAndInjection	Prompt injection, jailbreaks, DAN attacks, bypass techniques	20-40 min	All AI applications
extractionAttacks	System prompt extraction, config leakage, knowledge theft	10-20 min	Agentic systems
ragSecurity	Hallucination induction, context poisoning, retrieval attacks	25-45 min	RAG applications
agentSecurity	Agent alignment drift, goal hijacking, personalization attacks	20-35 min	AI agents
modelAndCodeSecurity	Model theft attempts, code injection via tool calls	30-50 min	Production models
biasAndSafety	Bias, fairness, safety compliance, harmful content generation	15-30 min	Regulated industries
trainingDataPrivacy	PII leakage from training data, memorisation extraction	10-40 min	Sensitive data systems
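The weighted budget split described above can be sketched with some simple arithmetic. This is an assumption-laden illustration, not DiscoveR's actual allocator: the function name, the even split within each group, and the choice of which categories count as high priority are all hypothetical.

```python
def split_budget(max_depth, high_priority, other, high_share=0.7):
    """Split a prompt budget between two groups of categories.

    Returns a dict mapping category name -> number of prompts.
    Which categories are "high priority" is decided by the caller here;
    the source does not say how DiscoveR ranks them.
    """
    high, rest = list(high_priority), list(other)
    if not high and not rest:
        return {}
    if not high or not rest:
        # Only one group selected: it receives the whole budget.
        groups = [(high or rest, max_depth)]
    else:
        high_total = round(max_depth * high_share)
        groups = [(high, high_total), (rest, max_depth - high_total)]

    allocation = {}
    for names, total in groups:
        base, extra = divmod(total, len(names))
        for i, name in enumerate(names):
            # Spread any remainder over the first few categories in the group.
            allocation[name] = base + (1 if i < extra else 0)
    return allocation
```

With a budget of 100, two high-priority categories, and one other category, this sketch assigns 35 prompts to each high-priority category and 30 to the third.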

Prompt Budget Guide

The max_depth parameter controls the total number of adversarial prompts DiscoveR executes across a scan. More prompts mean deeper coverage and longer scan time. Choose a budget that fits your pipeline stage and risk tolerance.

max_depth	Scan Type	Time	Coverage
10-20	Quick smoke test	2-10 min	Basic coverage: catches obvious regressions
30-60	Regular security scan	10-20 min	Good coverage: standard validation
80-100	Pre-release assessment	20-40 min	Thorough: suitable for release gates
150+	Deep security audit	40-60+ min	Maximum: adversarial thoroughness
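The budget guide above can be expressed as a small lookup, which is handy for logging the expected scan time in a pipeline. This is a sketch under assumptions: the function name is hypothetical, and values that fall between the documented ranges (for example 25) are mapped to the tier below, which the guide itself does not specify.

```python
from typing import Optional, Tuple

# (lower bound of max_depth, scan type, typical scan time) from the guide above.
BUDGET_TIERS = [
    (150, "Deep security audit", "40-60+ min"),
    (80, "Pre-release assessment", "20-40 min"),
    (30, "Regular security scan", "10-20 min"),
    (10, "Quick smoke test", "2-10 min"),
]


def describe_budget(max_depth: int) -> Optional[Tuple[str, str]]:
    """Return (scan type, typical time) for a budget, or None below 10 prompts."""
    for lower_bound, label, typical_time in BUDGET_TIERS:
        if max_depth >= lower_bound:
            return label, typical_time
    return None  # smaller than any documented budget
```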

Rerun and Scan Comparison

After fixing vulnerabilities, DiscoveR reruns the exact same attack prompts that exposed them. This confirms the fix was effective and surfaces any regressions. Every scan in a rerun chain shares a correlation_id inherited from the original scan. The complete history of a security test sequence is queryable as a unit.

You can also create a rerun that targets only the prompts that revealed vulnerabilities in the previous scan. This reduces scan time during iterative fixing without losing coverage on the specific attack vectors that mattered.
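Selecting only the prompts that revealed vulnerabilities amounts to a filter over the previous scan's results. The sketch below uses a hypothetical `ProbeResult` record and assumes that "passed" means the application resisted the attack; the real result objects may look different.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class ProbeResult:
    """Hypothetical per-probe result; field names are assumptions."""
    prompt: str
    category: str
    passed: bool  # assumed: True when the application resisted the attack


def prompts_for_targeted_rerun(results: Iterable[ProbeResult]) -> List[str]:
    """Return failed prompts in their original order, without duplicates."""
    seen = set()
    selected = []
    for result in results:
        if not result.passed and result.prompt not in seen:
            seen.add(result.prompt)
            selected.append(result.prompt)
    return selected
```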

Field	Original Scan	Rerun Scan
parent_scan_id	null	ID of the parent scan
correlation_id	Equals own scan ID	Inherits parent correlation_id
rerun flag	false	true
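The field rules in the table above can be modelled in a few lines. This is an illustrative data model only: the `ScanRecord` class, the helper names, and the use of UUIDs for scan IDs are assumptions, not the SDK's types.

```python
import uuid
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class ScanRecord:
    """Hypothetical scan record carrying the fields from the table above."""
    scan_id: str
    parent_scan_id: Optional[str]
    correlation_id: str
    rerun: bool


def new_scan() -> ScanRecord:
    """Original scan: no parent, and correlation_id equals its own ID."""
    scan_id = str(uuid.uuid4())
    return ScanRecord(scan_id, None, scan_id, False)


def new_rerun(parent: ScanRecord) -> ScanRecord:
    """Rerun: points at its parent and inherits the parent's correlation_id."""
    return ScanRecord(str(uuid.uuid4()), parent.scan_id, parent.correlation_id, True)


def chain(scans: Iterable[ScanRecord], correlation_id: str) -> List[ScanRecord]:
    """All scans that belong to one test sequence."""
    return [s for s in scans if s.correlation_id == correlation_id]
```

Because a rerun of a rerun still inherits the same correlation_id, one equality filter recovers the whole sequence without walking parent links.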

Custom Datasets: BYOAP

BYOAP (Bring Your Own Attack Prompts) lets you upload datasets of adversarial prompts specific to your application's domain, regulatory environment, or threat model. Use it when Mirror's built-in attack library does not cover a specific risk your compliance programme requires you to test. Custom datasets run alongside built-in categories or independently.

Integration

CI/CD Pipeline Integration

DiscoveR is designed to run unattended inside automated pipelines. The scan API is asynchronous: create a scan, poll for completion, read results, return a pass/fail signal.
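The poll-for-completion step can be sketched as a generic loop. To avoid guessing at SDK method names for scans, the status lookup is passed in as a callable; `wait_for_scan`, its parameters, and the default intervals are hypothetical. The status strings are the documented scan status values.

```python
import time

# Statuses after which polling should stop.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELLED"}


def wait_for_scan(get_status, scan_id, poll_seconds=15.0, timeout_seconds=3600.0,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll get_status(scan_id) until the scan reaches a terminal status.

    get_status is any callable returning one of PENDING, RUNNING,
    COMPLETED, FAILED, CANCELLED. Raises TimeoutError if the deadline passes.
    """
    deadline = clock() + timeout_seconds
    while True:
        status = get_status(scan_id)
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"scan {scan_id} still {status} after {timeout_seconds}s")
        sleep(poll_seconds)
```

A CI job would typically treat anything other than COMPLETED as a failed check, then read the findings and decide the pass/fail signal.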

Stage	Purpose	Configuration	Notes
Pull Request	Smoke test on every commit	quickScan, depth: 10-20	Fast enough for a CI check. Catches obvious regressions before code reaches review. Fails the build on any critical finding.
Pre-merge to Main	Deeper coverage before merge gate	jailbreakAndInjection, depth: 30	More thorough scan before code reaches the main branch. Targets the highest-impact attack categories.
Pre-release / Staging	Comprehensive release gate	All categories, depth: 80-100	Thorough scan before production deployment. No critical or high findings can be open at release time.
Production Monitor	Ongoing validation on schedule	quickScan, depth: 20, weekly	Scheduled scans confirm production has not drifted since deployment. Alert on new findings between releases.
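The stage settings above can be captured as data, with a small gate function that turns finding severities into a process exit code. This is a sketch under assumptions: the dictionary layout, stage keys, and lowercase severity labels are hypothetical, and the pre-merge blocking rule is a guess because the text above does not state one.

```python
# Stage settings from the pipeline stages above. Where a depth range is given,
# the upper end is used.
PIPELINE_STAGES = {
    "pull_request": {"categories": ["quickScan"], "max_depth": 20,
                     "blocking": {"critical"}},
    "pre_merge": {"categories": ["jailbreakAndInjection"], "max_depth": 30,
                  "blocking": {"critical"}},  # assumption: not stated in the source
    "pre_release": {"categories": "all", "max_depth": 100,
                    "blocking": {"critical", "high"}},
    # The production monitor alerts instead of blocking a build.
    "production_monitor": {"categories": ["quickScan"], "max_depth": 20,
                           "blocking": set()},
}


def gate(stage, finding_severities):
    """Return 1 if any finding blocks this stage, else 0 (usable as an exit code)."""
    blocking = PIPELINE_STAGES[stage]["blocking"]
    blockers = [s for s in finding_severities if s.lower() in blocking]
    return 1 if blockers else 0
```

Keeping the thresholds in one table makes it easy to review which severities block which stage.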

Domain Intelligence

Providing domain context improves attack targeting. DiscoveR selects and weights adversarial prompts most relevant to your deployment context when you specify the domain and application purpose.

Domain Hint	Targets AI Applications In
finance	Banking, trading, credit scoring, insurance
healthcare	Clinical decision support, patient records, pharma
hr	Policy assistants, recruitment, performance management
legal	Contract review, matter management, compliance
ecommerce	Product search, recommendations, customer support
customer_service	Support bots, complaint handling, escalation
education	Tutoring, course Q&A, assessment systems
technology	Code assistants, DevOps agents, IT support
generic / other	General-purpose assistants; use domainNotes for custom context
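One way to handle a domain that is not in the table above is to fall back to a general-purpose hint and carry the specifics in domainNotes. The sketch below is illustrative: the function name and the "domain" key are assumptions, and it treats "generic" and "other" as two accepted values.

```python
# Domain hints from the table above; "generic" and "other" are assumed
# to be two accepted spellings of the general-purpose hint.
DOMAIN_HINTS = {
    "finance", "healthcare", "hr", "legal", "ecommerce",
    "customer_service", "education", "technology", "generic", "other",
}


def domain_context(domain, notes=""):
    """Normalise a domain label into a hint plus free-text notes."""
    hint = domain.strip().lower()
    if hint not in DOMAIN_HINTS:
        # Unknown domain: keep the original label in the notes so it is not lost.
        notes = f"{domain}: {notes}" if notes else domain
        hint = "other"
    return {"domain": hint, "domainNotes": notes}
```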

Application Management API

Method	Signature	Returns
List applications	`sdk.redteam.get_applications()`	List[RedTeamApplication]
Get by ID	`sdk.redteam.get_application(app_id)`	RedTeamApplication
Create	`sdk.redteam.create_application(request)`	RedTeamApplication
Update	`sdk.redteam.update_application(app_id, request)`	RedTeamApplication
Delete	`sdk.redteam.delete_application(app_id)`	bool
Get metrics	`sdk.redteam.get_application_metrics(app_id)`	Dict (includes riskScore)
Wait for validation	`sdk.redteam.wait_for_application_validation(app_id)`	Blocks until ready
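A typical sequence using the methods in the table above is create, wait for validation, then read metrics. The sketch below calls only method names listed in the table, but several details are assumptions: that the returned application exposes an `id` attribute, that the request can be a plain dict, and that metrics behave like a dict with a `riskScore` key. Client construction is not shown because this page does not document it, so a stand-in object is used to keep the example runnable.

```python
from types import SimpleNamespace


def register_and_check(sdk, request):
    """Create an application, wait for validation, and return (app_id, riskScore)."""
    app = sdk.redteam.create_application(request)
    app_id = app.id  # assumption: the application object exposes an `id` attribute
    sdk.redteam.wait_for_application_validation(app_id)  # blocks until ready
    metrics = sdk.redteam.get_application_metrics(app_id)
    return app_id, metrics.get("riskScore")


def make_fake_sdk():
    """Stand-in with the same method names, for local experimentation only."""
    apps = {}

    def create_application(request):
        app = SimpleNamespace(id=f"app-{len(apps) + 1}", request=request)
        apps[app.id] = app
        return app

    redteam = SimpleNamespace(
        create_application=create_application,
        wait_for_application_validation=lambda app_id: None,
        get_application_metrics=lambda app_id: {"riskScore": 0.0},  # placeholder value
    )
    return SimpleNamespace(redteam=redteam)
```

Passing the client in as a parameter keeps the workflow testable without network access.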

Use Cases

Use Cases by Industry

Security Assurance

Pre-release Security Gate

An AI product team runs DiscoveR as the final gate before every release. The scan is comprehensive, covering all relevant categories, and no critical finding can be open at release time. The CI pipeline enforces this automatically.

Financial Services

Continuous Compliance Evidence

A financial services firm demonstrates ongoing AI security testing to regulators. DiscoveR runs on a weekly schedule. Scan results feed into the compliance dashboard as evidence of continuous security validation.

Engineering

Regression Testing After Changes

Every update to a system prompt, guardrail, or model version triggers a DiscoveR rerun against the previous scan's attack prompts. The correlation chain tracks whether security posture improved or regressed.

Procurement

Third-party Model Evaluation

An enterprise evaluating two LLM providers runs identical DiscoveR scans against both. The comparison surfaces which provider has fewer exploitable weaknesses under the same attack categories and prompt budget.

Technical Reference

Technical Specifications

SDK package	`pip install mirror-sdk`
Python version	3.9 and above
API endpoint	`https://mirrorapi.azure-api.net/v1`
Authentication	`MIRROR_API_KEY` environment variable
Attack categories	8 built-in; BYOAP for custom datasets
Application types	REST API, Streaming SSE, WebSocket, Browser-based Web App
Transport protocols	REST, SSE, WebSocket, NDJSON
Provider presets	OpenAI, Anthropic, Ollama, custom (generic)
Max scan depth	150+ prompts; no hard upper limit
Budget distribution	Weighted: high-priority attacks ~70%, others ~30%
Scan chain tracking	correlation_id links all reruns to the original scan
Scan status values	PENDING, RUNNING, COMPLETED, FAILED, CANCELLED
SDK version	v1.0
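Since authentication relies on the `MIRROR_API_KEY` environment variable, a pipeline can fail fast when the key is missing instead of discovering it mid-scan. The helper below is a hypothetical convenience, not part of the SDK; it uses only the standard library.

```python
import os

# Endpoint from the specifications above, kept here for reference in logs.
API_ENDPOINT = "https://mirrorapi.azure-api.net/v1"


def load_api_key(environ=os.environ):
    """Return the API key from the environment, or raise if it is missing."""
    key = environ.get("MIRROR_API_KEY", "").strip()
    if not key:
        raise RuntimeError("MIRROR_API_KEY is not set; export it before running a scan")
    return key
```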

Platform

How DiscoveR Fits with Other Mirror Products

DiscoveR

Security Testing Layer

This product. Finds weaknesses through automated red teaming before and after deployment.

AgentIQ

Runtime Behaviour Layer

Runtime guardrails that block the threats DiscoveR identified. Use DiscoveR to validate that AgentIQ policies are working as intended.

VectaX

Data Encryption Layer

Encrypts the data in the same applications DiscoveR tests for weaknesses. A complete deployment combines all three layers.
