Question 1

What is an AI agent and how is it different from a chatbot?

Accepted Answer

A chatbot receives a single message, generates a single response, and stops. An AI agent receives a goal, plans a sequence of steps, calls tools to gather information or take actions, processes the results, and continues iterating until the goal is achieved or a limit is reached. The defining difference is autonomy across multiple steps without a human confirming each one. An agent can browse the web, query databases, write files, and send API requests in a single session based on a single high-level instruction.

Question 2

What are the four components of an AI agent?

Accepted Answer

AI agents have four components. Perception is how the agent reads its environment: user instructions, tool outputs, retrieved documents, and any other input that enters the context window. Memory is how the agent stores information: in-context memory holds the current session up to the context limit, external memory is a persistent database the agent reads and writes across sessions, and episodic memory stores summaries of past sessions. Planning is how the agent decides what to do next, typically using techniques like ReAct (Reasoning and Acting) or Chain-of-Thought prompting to decompose a goal into ordered steps. Action is how the agent affects the world: by calling tools, writing files, making API requests, or sending messages.

Question 3

What are the three types of AI agent memory?

Accepted Answer

In-context memory is everything currently in the agent's context window: the system prompt, conversation history, tool outputs, and retrieved documents. It is fast to access but limited by the context window size and disappears when the session ends. External memory is a persistent store outside the context window, typically a vector database the agent can query and write to across multiple sessions. It survives session boundaries but requires an explicit retrieval step. Episodic memory stores summaries or embeddings of past sessions, allowing the agent to recall previous interactions and build on earlier work. External and episodic memory are both attack surfaces: if an attacker can write to them, they can influence future agent behaviour.

Question 4

How does tool calling work in an AI agent?

Accepted Answer

Tool calling is the mechanism by which an agent acts beyond generating text. The agent produces a structured JSON object specifying a function name and arguments. The runtime receives this, executes the function against the real system (web API, database, file system, shell), and returns the result as a new observation in the context window. The agent reads the observation and decides whether the goal is achieved or another tool call is needed. The security risk is that the agent cannot independently verify that the tool it is calling is the tool it believes it is calling, that the tool's output is genuine, or that calling the tool will not have harmful side effects.

Question 5

What is the difference between a single-agent and multi-agent system?

Accepted Answer

A single-agent system uses one LLM running a perception-planning-action loop repeatedly until the task is complete. A multi-agent system uses multiple LLMs that communicate through a shared state or message-passing protocol. Specialised agents handle different parts of a task: one agent might search the web, another might write code, a third might validate results. A hierarchical system adds an orchestrator that receives the top-level goal and delegates subtasks to specialist sub-agents. Multi-agent systems are more capable but introduce new attack surfaces: a compromised sub-agent can pass malicious instructions to the orchestrator, which then directs other sub-agents to take harmful actions.

Question 6

What are the main ways AI agents fail?

Accepted Answer

Agent failures fall into five categories. Planning failures happen when the agent decomposes a goal incorrectly, skips steps, or selects the wrong approach. Tool misuse happens when the agent calls the right tool with wrong arguments, or the wrong tool entirely, often because tool descriptions are ambiguous. Memory corruption happens when the agent's context contains stale, incorrect, or adversarially modified information that causes bad downstream decisions. Authority confusion happens when the agent cannot distinguish between instructions from its operator, its user, and content retrieved from the environment: a poisoned document can instruct the agent as if it were the operator. Runaway loops happen when the agent cannot recognise that a goal is impossible and continues taking actions indefinitely.

Question 7

Why are agent failures more dangerous than LLM failures?

Accepted Answer

An LLM failure produces a bad text response. A human reads it and decides whether to act on it. An agent failure takes real-world action without a human checkpoint. An agent that follows a malicious instruction can delete files, send emails, make API calls, or exfiltrate data before anyone notices. The failure also compounds: a wrong decision in step 3 of a 20-step task shapes all subsequent decisions. By step 15 the agent may be operating in a completely corrupted state without any single step looking obviously wrong. The surface area for attack is proportional to the number of tools the agent can call and the number of steps it takes.

Question 8

What is AgentIQ and where does it sit in an agent stack?

Accepted Answer

AgentIQ is Mirror Security's AI safety and compliance platform for securing agentic systems. It sits as a runtime guardrail layer between the agent and the world: it checks every input before it reaches the model (detecting prompt injection, PII, and jailbreak attempts), checks every output before it reaches a tool or downstream system (checking for hallucination, toxicity, and policy violations), and enforces policies defined in the Mirror Policy DSL. The primary integration is through the Mirror SDK (pip install mirror_sdk) using Python decorators or the programmatic API. AgentIQ operates with sub-200ms latency so it can run in-line on every agent turn.

Question 9

What is the ReAct pattern for AI agents?

Accepted Answer

ReAct stands for Reasoning and Acting. It is a prompting pattern that instructs the agent to explicitly reason about what to do before taking each action. The agent produces a Thought (what do I need to do and why), an Action (which tool to call and with what arguments), and then processes the Observation (what the tool returned) before producing the next Thought. This makes the agent's reasoning visible and inspectable. From a security perspective it also makes the agent's behaviour more predictable and easier to monitor: anomalous reasoning steps (for example, a Thought that includes instructions from retrieved content) are easier to detect in a structured ReAct trace than in a free-form generation.

Question 10

What is authority confusion in an AI agent?

Accepted Answer

Authority confusion is when an agent cannot distinguish between instructions from different sources and treats them all as equally authoritative. A well-designed agent has a clear trust hierarchy: the system prompt (operator instructions) has the highest authority, user messages have medium authority, and content retrieved from the environment (web pages, documents, tool outputs) has the lowest authority. Authority confusion happens when retrieved content contains instruction-like text that the agent follows as if it came from the operator. This is the mechanism behind indirect prompt injection attacks, covered in B2.

Question 11

What does AgentIQ's policy engine do?

Accepted Answer

AgentIQ's policy engine lets you define declarative rules that govern what an agent can receive, say, and do. Policies are written in the Mirror Policy DSL: deny rules block interactions that match a condition (for example, deny message input where check_prompt_injection() == true), allow rules explicitly permit interactions that might otherwise be blocked, and check statements evaluate outputs for hallucination, toxicity, bias, and PII. Tool call policies restrict which functions an agent can invoke (deny tool_call where function.name == 'dangerous_function'). Policies can be generated from plain English in the Policy Workbench or written manually in the DSL.

Question 12

What AgentIQ capabilities are covered in Track 2B?

Accepted Answer

Track 2B covers AgentIQ across all six modules. B1 (this module) introduces the platform and SDK setup. B2 covers prompt injection detection using sdk.agentiq.detect_prompt_injection and the prevent_injection policy. B3 covers tool call policies (deny tool_call, network_security, sql_security, file_security policies). B4 covers the unified safety API (sdk.safety.analyze), the @policy_monitor decorator, and check_output statements. B5 covers identity and least privilege in the context of agent credential scoping. B6 covers multi-agent trust policies. Full AgentIQ documentation is at platform.mirrorsecurity.io.

Agent Architecture
& How Agents Fail

What an AI agent actually is

The anatomy of an agent

Types of agent memory

Tool calling

Orchestration patterns

How agents fail

Why agent failures are different from LLM failures

AgentIQ: runtime guardrails for agents

Runtime guardrails for production AI agents

Agent Architecture& How Agents Fail