Design Patterns for Securing LLM Agents Against Prompt Injection Attacks

Large Language Model (LLM) agents are transforming how we interact with software systems, but they introduce critical security vulnerabilities. Among the most dangerous threats are prompt injection attacks, where malicious instructions embedded in content manipulate an agent’s behavior to perform unauthorized actions.

While securing general-purpose agents remains challenging, researchers have identified six design patterns that can make application-specific agents resistant to prompt injection attacks.

The Core Security Challenge

Prompt injection attacks occur when adversaries introduce malicious instructions into content processed by an LLM. These attacks can lead to:

Data exfiltration and privilege escalation
Remote code execution through tool access
Manipulation of agent reasoning and outputs
Denial of service through resource exhaustion

Traditional security frameworks struggle with these attacks because LLMs process natural language input, making it difficult to distinguish between legitimate instructions and malicious injections.

Six Security Design Patterns

1. Action-Selector Pattern

The agent acts as a translator between natural language requests and predefined actions. It selects from a fixed list of allowed operations without processing feedback from those actions.

Example: A customer service chatbot with predetermined responses like “retrieve order status” or “redirect to password settings.”

Security benefit: The LLM never directly processes untrusted data, making prompt injection impossible.

2. Plan-Then-Execute Pattern

The agent creates a fixed execution plan before processing any untrusted data. While tool outputs may contain malicious content, they cannot alter the predetermined sequence of actions.

Example: An email assistant that commits to specific tool calls (read calendar, compose email) before executing them.

Security benefit: Provides control flow integrity—malicious data cannot change what actions the agent takes.

3. LLM Map-Reduce Pattern

The system dispatches isolated sub-agents to process individual pieces of untrusted data (map operation), then aggregates results through either non-LLM operations or LLMs with strict output constraints (reduce operation).

Example: A file analysis system that processes each document separately, returning only boolean results about whether files contain invoices.

Security benefit: Malicious content in one document cannot affect processing of other documents.

4. Dual LLM Pattern

This pattern separates privileged and quarantined LLM instances:

Privileged LLM: Receives instructions and can use tools, but never processes untrusted data
Quarantined LLM: Processes untrusted data but cannot use tools or take actions

Results from quarantined LLMs are stored symbolically and dereferenced only by traditional software components.

Security benefit: Untrusted data never reaches the decision-making LLM.

5. Code-Then-Execute Pattern

The agent writes a formal program to solve tasks, which may call tools and spawn unprivileged LLMs to process untrusted content. This generalizes the plan-then-execute pattern by replacing planning with explicit programming.

Security benefit: Provides the strongest isolation between untrusted data and agent control flow.

6. Context-Minimization Pattern

The system removes user prompts from the LLM’s context after processing initial requests, preventing prompt injections from affecting subsequent operations.

Example: A customer service system that translates user requests into database queries, then removes the original prompt before returning results.

Security benefit: Eliminates user prompt injections from downstream processing.

Real-World Applications

The researchers demonstrate these patterns through ten case studies:

OS Assistant: Uses dual LLM pattern to safely process file contents
SQL Agent: Applies code-then-execute to prevent database manipulation
Email Assistant: Combines plan-then-execute with user confirmation
Customer Service: Implements action-selector with context minimization
Medical Diagnosis: Uses context minimization to prevent manipulation of doctor responses

Implementation Trade-offs

Each pattern involves security-utility trade-offs:

High Security, Lower Utility: Action-selector and context-minimization patterns provide strong security but limit flexibility.

Balanced Approach: Plan-then-execute and map-reduce patterns maintain reasonable utility while providing meaningful security guarantees.

Complex but Powerful: Dual LLM and code-then-execute patterns offer the strongest security but require more sophisticated implementation.

Key Recommendations

Prioritize application-specific agents over general-purpose ones when security matters
Combine multiple patterns for robust defense—no single pattern addresses all threats
Define clear trust boundaries between privileged and unprivileged components
Implement traditional security practices like sandboxing and least privilege alongside these patterns

The Path Forward

While general-purpose agents that can solve arbitrary tasks remain vulnerable to prompt injection, these design patterns show that secure, useful AI agents are achievable today. The key is accepting intentional constraints on agent capabilities in exchange for security guarantees.

As LLM agents become more prevalent in critical applications, adopting these principled design patterns will be essential for safe deployment. The research provides a practical foundation for building AI systems that remain secure even when processing untrusted content.

Design Patterns for Securing LLM Agents Against Prompt Injection Attacks

Design Patterns for Securing LLM Agents Against Prompt Injection Attacks

The Core Security Challenge

Six Security Design Patterns

1. Action-Selector Pattern

2. Plan-Then-Execute Pattern

3. LLM Map-Reduce Pattern

4. Dual LLM Pattern

5. Code-Then-Execute Pattern

6. Context-Minimization Pattern

Real-World Applications

Implementation Trade-offs

Key Recommendations

The Path Forward

Introducing A2UI: An open project for agent-driven interfaces

AI Agents for Economic Research: A Comprehensive Guide to Building Autonomous Research Systems

Reverse Engineering Claude Code: The Secret Sauce Behind Better AI Coding Agents