FuzzingBrain V2: Automated Vulnerability Discovery Through Multi-Agent LLM Systems

Software vulnerabilities threaten critical systems worldwide. With nearly 50,000 CVEs reported in 2025, security teams need automated tools that can discover and verify real vulnerabilities without drowning in false positives.

FuzzingBrain V2 addresses this problem by combining Large Language Model reasoning with fuzzing-based verification. The system achieved a 90% detection rate on the AIxCC 2025 dataset and discovered 41 zero-day vulnerabilities in real-world projects.

The Core Problem

Traditional vulnerability detection approaches face three critical gaps:

Gap 1: Unverifiable Results
LLM-based tools generate vulnerability reports but cannot prove they’re real. Security teams waste time investigating false positives.

Gap 2: Wrong Granularity
Function-level analysis misses bugs when context becomes extensive. Line-level analysis lacks sufficient context for accurate detection.

Gap 3: Complex Dependencies
Vulnerabilities spanning multiple functions with intricate data flows confuse existing approaches.

How FuzzingBrain V2 Works

Suspicious Points: The Key Innovation

FuzzingBrain V2 introduces “Suspicious Points” (SPs) - a novel abstraction that captures vulnerability-relevant code regions with control flow context. Instead of analyzing entire functions or individual lines, SPs focus on specific code patterns that could be vulnerable.

Each SP contains:

Function location and vulnerability description
Vulnerability type (buffer overflow, use-after-free, etc.)
Confidence score and verification status
Proof-of-concept guidance for reproduction
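
The four kinds of information listed above map naturally onto a small record type. The following is a minimal sketch, not the system's actual schema: the class name, field names, the 0 to 1 confidence range, and the three status values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class VerificationStatus(Enum):
    # Hypothetical status values; the article only says a status is stored.
    PENDING = "pending"
    CONFIRMED = "confirmed"
    REJECTED = "rejected"


@dataclass
class SuspiciousPoint:
    function: str       # function location, e.g. "parse_header"
    description: str    # free-text vulnerability description
    vuln_type: str      # e.g. "buffer overflow", "use-after-free"
    confidence: float   # assumed to lie in [0.0, 1.0]
    status: VerificationStatus = VerificationStatus.PENDING
    poc_guidance: str = ""  # hints for reproducing the issue

    def __post_init__(self):
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be in [0, 1]")


def rank_for_verification(points):
    """Return the still-unverified SPs, highest confidence first."""
    pending = [p for p in points if p.status is VerificationStatus.PENDING]
    return sorted(pending, key=lambda p: p.confidence, reverse=True)
```

Sorting pending SPs by confidence is one plausible way to decide which points to verify first; the article does not say how the real system prioritizes them.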

Multi-Agent Architecture

The system employs specialized agents working together:

Direction Generator: Divides codebases into logical business features
SP Generator: Identifies suspicious code patterns
SP Verifier: Performs deep analysis to filter false positives
PoC Generator: Crafts inputs that trigger vulnerabilities
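
The hand-off between the four agents can be pictured as a simple staged pipeline. The sketch below is only an illustration of that data flow: each agent is modeled as a plain Python callable, and the function name `run_pipeline`, the toy codebase, and the stub agents are hypothetical stand-ins for what would be LLM-backed components.

```python
def run_pipeline(codebase, direction_gen, sp_gen, sp_verifier, poc_gen):
    """Chain the four agent roles and collect (sp, poc) pairs."""
    findings = []
    for direction in direction_gen(codebase):   # split into business features
        for sp in sp_gen(direction):            # propose suspicious points
            if not sp_verifier(sp):             # drop likely false positives
                continue
            poc = poc_gen(sp)                   # try to craft a triggering input
            if poc is not None:
                findings.append((sp, poc))
    return findings


def demo():
    # Toy stand-ins: a "codebase" is a dict of feature -> function names.
    codebase = {
        "parsing": ["read_chunk", "copy_name"],
        "network": ["send_reply"],
    }
    direction_gen = lambda cb: cb.items()
    sp_gen = lambda d: [f"{d[0]}:{fn}" for fn in d[1]]
    sp_verifier = lambda sp: "copy" in sp or "read" in sp
    poc_gen = lambda sp: b"A" * 64 if "copy" in sp else None
    return run_pipeline(codebase, direction_gen, sp_gen, sp_verifier, poc_gen)


if __name__ == "__main__":
    print(demo())
```

In this toy run only `parsing:copy_name` survives both the verifier and the PoC stage, which mirrors the filtering role the article assigns to the SP Verifier and PoC Generator.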

Dual-Layer Fuzzing

Two fuzzing layers provide comprehensive coverage:

Global Fuzzer: Runs continuously for broad exploration
SP Fuzzer: Targets specific suspicious points for deep verification

Real-World Results

Competition Performance

On the AIxCC 2025 Final Competition dataset:

90% detection rate (36 of 40 vulnerabilities)
75% success on hard challenges requiring deep reasoning
First place among all competing systems

Zero-Day Discoveries

In real-world deployment across 19 open-source projects:

41 previously unknown vulnerabilities discovered
26 confirmed by maintainers
23 fixed with patches deployed
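
The percentages implied by these counts are easy to recompute. The short sketch below derives them from the figures quoted in this section; the helper name `pct` is just an illustrative choice.

```python
def pct(part, whole):
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100 * part / whole, 1)


detected, challenges = 36, 40          # competition dataset
found, confirmed, fixed = 41, 26, 23   # real-world deployment

print(pct(detected, challenges))  # 90.0  detection rate
print(pct(confirmed, found))      # 63.4  share of findings confirmed by maintainers
print(pct(fixed, found))          # 56.1  share of findings already fixed
print(pct(fixed, confirmed))      # 88.5  share of confirmed findings fixed
```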

Notable discoveries include 6 vulnerabilities in OpenPrinting CUPS and 4 each in fwupd and UPX, all mature, well-tested projects.

Technical Advantages

Logic-Driven Search

Rather than matching known vulnerability patterns, FuzzingBrain V2 analyzes business logic to identify suspicious behaviors. This enables discovery of novel vulnerability classes.

OSS-Fuzz Integration

Built on Google’s OSS-Fuzz infrastructure, the system:

Integrates with 1,000+ open-source projects
Guarantees 100% reproducibility for confirmed vulnerabilities
Automatically generates submission-ready reports

Context Engineering

The system uses the Model Context Protocol (MCP) for seamless tool integration, enabling agents to:

Perform static analysis across function boundaries
Trace complex data flows
Reason about program state and constraints

Complex Case Studies

FuzzingBrain V2 excels at discovering vulnerabilities that defeat traditional fuzzing:

Leap Second Bug: Required parsing responses with historical leap-second timestamps (seconds=60) at call depth 10. The agent systematically tested different historical dates until it found one that bypassed validation.
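
The search over historical dates can be sketched as a simple enumeration. This is an illustration under stated assumptions, not the agent's actual procedure: the target's timestamp format is not given in the article, so an ISO 8601-style string is assumed, the `validator` callback is a hypothetical stand-in for the target's validation logic, and the list holds only the five most recent leap-second dates. Note that Python's own `datetime` rejects `second=60`, so the candidates are built as strings.

```python
from datetime import date

# Days (UTC) that ended with an inserted leap second; a recent subset only.
LEAP_SECOND_DATES = [
    date(2005, 12, 31),
    date(2008, 12, 31),
    date(2012, 6, 30),
    date(2015, 6, 30),
    date(2016, 12, 31),
]


def candidate_timestamps(dates=LEAP_SECOND_DATES):
    """Yield one 23:59:60 timestamp per known leap-second date."""
    for d in dates:
        # Assumed wire format; the real protocol may encode time differently.
        yield f"{d.isoformat()}T23:59:60Z"


def first_accepted(validator, candidates):
    """Return the first candidate the validator accepts, or None."""
    for ts in candidates:
        if validator(ts):
            return ts
    return None
```

For example, `first_accepted(lambda ts: ts.startswith("2012"), candidate_timestamps())` returns `"2012-06-30T23:59:60Z"`.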

Type Confusion: Exploited protocol bit collision requiring AES-256-CBC encryption with hardcoded keys. The agent reverse-engineered the encryption scheme from source code and implemented the three-phase handshake.

Implementation Efficiency

Successful vulnerability discovery averages:

12 minutes for Delta-scan challenges
18 minutes for Full-scan challenges
$35 average cost per discovered vulnerability

Across the 40 challenges, the system processed more than 500,000 tokens at a total cost of under $1,800.

Getting Started

FuzzingBrain V2 operates in two modes:

Full-Scan Mode: Comprehensive analysis of entire codebases reachable by specific fuzzers.

Delta-Scan Mode: Focused analysis of code changes between versions (e.g., commits).
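
A delta scan has to start from the regions a change actually touches. As a minimal sketch of that first step, the code below extracts the post-change line ranges of each file from a unified diff. The function name `changed_regions` is hypothetical, and the article does not describe how FuzzingBrain V2 itself processes commits. An added source line that itself begins with `+++ ` would be misread as a file header; that edge case is ignored here.

```python
import re

# Matches a hunk header such as "@@ -10,3 +10,5 @@" and captures the
# start line and (optional) length on the new-file side.
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")


def changed_regions(diff_text):
    """Map each changed file to a list of (first_line, last_line) ranges."""
    regions = {}
    current = None
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:].strip()
            if path == "/dev/null":          # file was deleted
                current = None
            else:
                current = path[2:] if path.startswith("b/") else path
        elif current is not None:
            m = HUNK_RE.match(line)
            if m:
                start = int(m.group(1))
                length = int(m.group(2)) if m.group(2) is not None else 1
                if length > 0:               # length 0 means pure deletion
                    regions.setdefault(current, []).append(
                        (start, start + length - 1)
                    )
    return regions
```

The resulting ranges could then be mapped to enclosing functions, so that analysis is limited to code affected by the commit.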

The system requires:

OSS-Fuzz compatible project structure
C/C++ source code (Java support via Jazzer)
Configured fuzzers and sanitizers
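
For the first requirement, OSS-Fuzz project directories conventionally contain three files: `project.yaml`, `Dockerfile`, and `build.sh`. A quick pre-flight check might look like the sketch below; the function name is hypothetical, and this is not a check the article says FuzzingBrain V2 performs.

```python
from pathlib import Path

# Files an OSS-Fuzz project directory is conventionally expected to hold.
REQUIRED_FILES = ("project.yaml", "Dockerfile", "build.sh")


def missing_oss_fuzz_files(project_dir):
    """Return the required OSS-Fuzz files absent from project_dir."""
    root = Path(project_dir)
    return [name for name in REQUIRED_FILES if not (root / name).is_file()]
```

An empty list means the basic layout is in place; it says nothing about whether the fuzzers and sanitizers named in `project.yaml` actually build.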

Future Applications

FuzzingBrain V2’s architecture enables several extensions:

Multi-language support for Rust, Go, and Python
Binary analysis through emulation-based fuzzing
Automatic patch generation for discovered vulnerabilities
Integration with CI/CD pipelines for continuous security testing

Key Takeaway

FuzzingBrain V2 demonstrates that LLM agents can effectively guide vulnerability discovery when grounded by concrete execution feedback and structured by well-defined analysis abstractions. The system bridges the gap between semantic code understanding and practical security verification.

For security teams seeking automated vulnerability discovery with high accuracy and low false positive rates, FuzzingBrain V2 offers a production-ready solution that scales across the 1,000+ open-source projects integrated with OSS-Fuzz.

FuzzingBrain V2: A Multi-Agent LLM System for Automated Vulnerability Discovery and Reproduction