According to Beating Monitoring, AgentFlow automatically synthesizes multi-agent harnesses. It uses a typed graph DSL to unify five harness dimensions (roles, topology, message patterns, tool bindings, coordination protocols) into an editable graph program, and its outer loop uses runtime signals to locate failures. On Chrome, a synthesized harness with 18 roles and approximately 210 agents, including 192 parallel explorers, discovered 10 zero-day vulnerabilities in 7 days; 6 of them have been assigned CVEs, including sandbox escapes. AgentFlow has been open-sourced.

BlockBeatNews

2026-04-23 06:51:01


According to Beating Monitoring, Feng Yu's team at UCSB, in collaboration with fuzz.land and other organizations, has proposed AgentFlow, a system that automatically synthesizes multi-agent harnesses for vulnerability discovery. A harness is the program that coordinates agent roles, information transfer, tool allocation, and retry logic. The paper states that, with the model held fixed, modifying the harness alone can raise success rates several-fold, yet existing solutions are mostly handcrafted or search only part of the design space.

AgentFlow uses a typed graph DSL to unify the five dimensions of a harness (roles, topology, message patterns, tool bindings, coordination protocols) into an editable graph program, so that a single step can modify agents, topology, prompts, and toolsets at the same time. The outer loop locates failure points from runtime signals such as target-program coverage and sanitizer reports, rather than relying on binary success/failure feedback. On TerminalBench-2, paired with Claude Opus 4.6, it scored 84.3% (75/89), the highest among comparable entries on that leaderboard.
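To make the idea of an editable graph program concrete, here is a minimal Python sketch. It is not AgentFlow's actual DSL: the names `Agent`, `Edge`, `Harness`, `add_agent`, and `locate_failure`, and the shape of the `signals` dictionary, are all hypothetical and chosen only for illustration. The sketch shows one edit step that changes agents and topology together, plus a simple heuristic that picks a failure point from coverage and sanitizer-report counts instead of a pass/fail bit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agent:
    name: str          # unique node identifier
    role: str          # e.g. "explorer", "crash_filter" (illustrative values)
    prompt: str
    tools: frozenset   # tool bindings for this agent


@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    message_type: str  # message pattern carried along this edge


@dataclass(frozen=True)
class Harness:
    agents: tuple
    edges: tuple

    def validate(self):
        """Reject graphs with duplicate agents or dangling edges."""
        names = {a.name for a in self.agents}
        if len(names) != len(self.agents):
            raise ValueError("duplicate agent name")
        for e in self.edges:
            if e.src not in names or e.dst not in names:
                raise ValueError(f"edge {e.src}->{e.dst} references unknown agent")
        return self


def add_agent(harness, agent, feeds_into, message_type):
    """One edit step that changes the agent set and the topology together."""
    edge = Edge(agent.name, feeds_into, message_type)
    return Harness(harness.agents + (agent,), harness.edges + (edge,)).validate()


def locate_failure(signals):
    """Hypothetical heuristic: among agents that produced no sanitizer
    reports, return the one with the lowest coverage, or None."""
    stalled = [n for n, s in signals.items() if s["sanitizer_reports"] == 0]
    if not stalled:
        return None
    return min(stalled, key=lambda n: signals[n]["coverage"])
```

Because every edit returns a new validated `Harness`, an outer search loop could try an edit, run the harness, and keep or discard the result without mutating earlier candidates. That design choice is an assumption of this sketch, not something the article describes.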

On the Chrome codebase (35 million lines of C/C++), the system synthesized a harness with 18 roles and approximately 210 agents, including 7 subsystem analyzers, 192 parallel explorers, and a four-stage crash classification pipeline. Dedicated agents such as Crash Filter and Root Cause Analyzer deduplicate crashes by their unique ASAN crash signatures. Running the open-source model Kimi K2.5 on 192 H100 GPUs for 7 days, the harness discovered 10 zero-day vulnerabilities, all confirmed by the Chrome VRP (Vulnerability Reward Program). Six have been assigned CVE numbers. They affect WebCodecs, Proxy, Network, Codecs, and Rendering, and their types include use-after-free (UAF), integer overflow, and heap buffer overflow. Of these, CVE-2026-5280 and CVE-2026-6297 are critical sandbox escape vulnerabilities.
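The article does not say how the ASAN crash signatures are computed. A common approach, sketched below in Python as an assumption rather than AgentFlow's method, is to hash the bug type from the AddressSanitizer error line together with the function names of the top few stack frames, ignoring addresses that vary between runs. The helper names `crash_signature` and `deduplicate`, and the default of three frames, are hypothetical.

```python
import hashlib
import re

# Bug type from a line such as
# "==1234==ERROR: AddressSanitizer: heap-use-after-free on address ..."
ERROR_RE = re.compile(r"ERROR: AddressSanitizer: (\S+)")

# Function name from a frame line such as
# "    #0 0x55d1 in media::Decoder::Reset() decoder.cc:10:3"
FRAME_RE = re.compile(r"^\s*#\d+\s+0x[0-9a-fA-F]+\s+in\s+(\S+)", re.MULTILINE)


def crash_signature(report, top_frames=3):
    """Hash the bug type plus the first few frame functions (assumed scheme)."""
    match = ERROR_RE.search(report)
    bug_type = match.group(1) if match else "unknown"
    frames = FRAME_RE.findall(report)[:top_frames]
    key = "|".join([bug_type, *frames])
    return hashlib.sha256(key.encode()).hexdigest()[:16]


def deduplicate(reports):
    """Keep the first report seen for each signature."""
    unique = {}
    for report in reports:
        unique.setdefault(crash_signature(report), report)
    return list(unique.values())
```

With this scheme, two reports of the same bug that differ only in memory addresses collapse to one entry, while a different bug type or call path yields a separate signature. Using fewer frames merges more aggressively; using more frames splits crashes that share a root cause but differ deeper in the stack.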

fuzz.land co-founder Shou Chaofan said that some of the vulnerabilities were first discovered with MiniMax M2.5, and that MiniMax M2.5 and Opus 4.6 can also find most of them. AgentFlow has been open-sourced.

AgentFlow automatically synthesizes multi-agent systems to uncover Chrome sandbox escape zero-day vulnerabilities

Trending Topics

Gate13thAnniversaryLive

WCTCTradingChallengeShare8MUSDT

BitcoinBouncesBack

EthereumMemeSeasonReturns

USIranTalksProgress

Pin