In narrow, well-defined use cases like yield optimization, agents have demonstrated performance superior to humans and bots. But for multi-faceted actions like trading, humans still outperform agents.

Among agents, model selection and risk management have the greatest impact on trading performance.

As agents are adopted at scale, they face multiple trust and execution risks, including Sybil attacks, strategy congestion, and privacy trade-offs.

Agent Activity Continues to Grow

Over the past year, agent activity has steadily increased, with both trading volume and number of trades rising. We see Coinbase’s x402 protocol leading significant developments, with players like Visa, Stripe, and Google also launching their own standards. Most of the infrastructure currently being built aims to serve two scenarios: channels between agents or agent calls triggered by humans.

While stablecoin trading is widely supported, current infrastructure still relies on traditional payment gateways as the underlying layer, meaning it remains dependent on centralized counterparties. The fully autonomous end state, in which agents can self-finance, self-execute, and continuously optimize based on changing conditions, has therefore not yet been realized.

Agent activity is not unfamiliar to DeFi. For years, on-chain protocols have employed automation via bots to capture MEV or extract excess profits that cannot be achieved without code. These systems perform very well under clearly defined parameters that do not change frequently or require additional oversight.

However, markets have become more complex over time. This is where the new generation of agents comes in: in recent months, on-chain activity has become a testing ground for them.

Agent Performance in Practice

According to reports, agent activity has grown exponentially, with over 17,000 agents launched since 2025. Automation and agent activity together are estimated to account for over 19% of all on-chain activity. This is not surprising given that over 76% of stablecoin transfers are estimated to be bot-generated, and it points to substantial growth potential for agent activity in DeFi.

Agent autonomy spans a broad spectrum—from chatbots requiring high human oversight to agents capable of devising strategies that adapt to market conditions based on input goals. Compared to bots, agents have several key advantages, including the ability to respond and act on new information within milliseconds, and to scale coverage across thousands of markets while maintaining strict standards.

Most current agents are still at analyst or co-pilot levels, as they remain in testing phases.

Yield Optimization: Agents Perform Well

Liquidity provision is a domain where automation is already common, with total TVL held by agents exceeding $39 million. This figure mainly measures assets that users deposit directly into agents and excludes capital routed through vaults.

Giza Tech is one of the largest protocols in this space, having launched its first agent application, ARMA, at the end of last year, aimed at enhancing yield capture on major DeFi protocols. It has attracted over $19 million in managed assets and generated over $4 billion in agent trading volume.

The high ratio of trading volume to assets under management indicates that agents frequently rebalance capital, enabling higher yield capture. Once capital is deposited into the contract, execution is automated, providing users with a simple one-click experience that requires almost no supervision.
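To make the volume-to-assets ratio concrete, here is a minimal sketch that divides the two figures quoted above. Both are lower bounds ("over $19 million", "over $4 billion"). Volume is cumulative while managed assets are a snapshot, so the result is only a rough turnover indicator. The function and variable names are illustrative, not part of any Giza API.

```python
# Figures quoted in the article; both are stated as lower bounds.
managed_assets_usd = 19_000_000        # "over $19 million in managed assets"
agent_volume_usd = 4_000_000_000       # "over $4 billion in agent trading volume"


def turnover_ratio(volume_usd: float, assets_usd: float) -> float:
    """Return how many times the managed capital has been traded over."""
    if assets_usd <= 0:
        raise ValueError("assets_usd must be positive")
    return volume_usd / assets_usd


ratio = turnover_ratio(agent_volume_usd, managed_assets_usd)
print(f"Cumulative volume is roughly {ratio:.0f}x managed assets")
```

On these numbers each deposited dollar has been moved a couple of hundred times, which is consistent with frequent rebalancing.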

ARMA’s performance is measurable and excellent, generating over 9.75% annualized yield on USDC. Even after accounting for rebalancing fees and a 10% performance fee for the agent, returns still surpass those of ordinary lending on Aave or Morpho. Nonetheless, scalability remains a key issue, as these agents have not yet been tested in real-world scenarios at the scale of major DeFi protocols.
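The fee arithmetic can be sketched as follows. The article does not say whether the 9.75% figure is gross or net of fees, so this example assumes it is gross. It also assumes the performance fee is charged on yield after rebalancing costs. The 5% benchmark rate and the 0.25% rebalancing cost are purely hypothetical placeholders, not quoted Aave or Morpho rates.

```python
def net_yield(gross_apy: float, performance_fee: float,
              rebalancing_cost: float = 0.0) -> float:
    """Annual yield left to the depositor, with all inputs as fractions.

    Assumption: the performance fee applies to yield net of rebalancing cost.
    """
    yield_after_costs = gross_apy - rebalancing_cost
    fee = performance_fee * max(yield_after_costs, 0.0)
    return yield_after_costs - fee


def breakeven_gross(benchmark_apy: float, performance_fee: float,
                    rebalancing_cost: float = 0.0) -> float:
    """Gross yield the agent must earn to match a benchmark after fees."""
    return benchmark_apy / (1.0 - performance_fee) + rebalancing_cost


gross = 0.0975            # from the article, assumed here to be gross
fee = 0.10                # 10% performance fee, from the article
net = net_yield(gross, fee)

hypothetical_benchmark = 0.05     # placeholder lending rate, not a quoted figure
hypothetical_rebalancing = 0.0025 # placeholder annual rebalancing cost
needed = breakeven_gross(hypothetical_benchmark, fee, hypothetical_rebalancing)

print(f"Net yield before rebalancing costs: {net:.2%}")
print(f"Gross yield needed to match the benchmark: {needed:.2%}")
```

Under these assumptions the performance fee alone takes a little under one percentage point, so the agent still clears any lending rate below its net yield.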

Trading: Humans Significantly Ahead

However, for more complex actions like trading, results are much more varied. Current trading models operate based on human-defined inputs and produce outputs according to preset rules. Machine learning extends this by enabling models to update their behavior based on new information without explicit reprogramming, pushing them into a co-pilot role. With fully autonomous agents joining, the trading landscape will undergo significant change.

Several competitions have been held between agents and humans, revealing large performance gaps. Trade XYZ hosted a human vs. agent stock trading contest on its platform. Each account started with $10k, with no leverage or trading frequency limits. The results overwhelmingly favored humans, with top human traders outperforming top agents by more than five times.

Meanwhile, Nof1 organized agent competitions among several models (Grok-4, GPT-5, Deepseek, Kimi, Qwen3, Claude, Gemini), testing different risk configurations from capital preservation to maximum leverage. The results highlighted several factors explaining performance differences:

Position Holding Time: Strongly correlated with performance. Models that held positions for an average of 2-3 hours significantly outperformed those that flipped positions frequently.

Expected Value: Measures whether the model's trades are profitable on average. Interestingly, only the top three models had positive expected value, indicating that for most models the losses on losing trades outweigh the gains on profitable ones.

Leverage: Models running lower leverage, averaging 6-8x, performed better than those running over 10x, as high leverage accelerates losses.

Prompt Strategies: Monk Mode was the best-performing prompt strategy so far, while Situational Awareness performed the worst. Judging by the strategies' characteristics, focusing on risk management and relying on fewer external sources tends to yield better results.

Base Models: Grok 4.20 significantly outperformed other models by over 22% across different prompt strategies and was the only model with an average profit.

Other factors like long/short bias, trade size, and confidence scores lack sufficient data or have not shown positive correlation with performance. Overall, results suggest that agents tend to perform better within clearly defined constraints, indicating that human oversight remains crucial for goal configuration.
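Two of the factors above, expected value and leverage, reduce to simple arithmetic. The sketch below is illustrative only. The win rates and trade sizes are hypothetical and are not taken from the Nof1 results. The leverage calculation ignores fees, funding, and maintenance margin.

```python
def expected_value(win_rate: float, avg_win: float, avg_loss: float) -> float:
    """Average profit per trade; avg_win and avg_loss are positive amounts."""
    if not 0.0 <= win_rate <= 1.0:
        raise ValueError("win_rate must be between 0 and 1")
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss


def wipeout_move(leverage: float) -> float:
    """Adverse price move (as a fraction) that erases the posted margin.

    Simplification: ignores fees, funding, and maintenance margin.
    """
    if leverage <= 0:
        raise ValueError("leverage must be positive")
    return 1.0 / leverage


# Hypothetical profiles: a high win rate does not guarantee positive EV.
frequent_winner = expected_value(win_rate=0.70, avg_win=10.0, avg_loss=30.0)
patient_trader = expected_value(win_rate=0.40, avg_win=50.0, avg_loss=20.0)
print(f"70% win rate, small wins: EV = {frequent_winner:+.2f} per trade")
print(f"40% win rate, large wins: EV = {patient_trader:+.2f} per trade")

# Leverage levels mentioned in the article.
for lev in (6, 8, 10):
    print(f"{lev}x leverage: margin gone after a {wipeout_move(lev):.1%} adverse move")
```

The first profile wins most of its trades yet loses money on average, which is why expected value is a more useful yardstick than win rate alone. The second loop shows how the room for error shrinks as leverage rises.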

How to Evaluate Agents

Given that agents are still in early stages, there is no comprehensive evaluation framework yet. Historical performance is often used as a benchmark, but it is shaped by underlying factors that provide stronger signals of agent effectiveness.

Performance Under Different Volatility Conditions: Includes disciplined loss control during adverse conditions, indicating agents can recognize off-chain factors affecting profitability.

Transparency and Privacy: Both have trade-offs. Transparent agents that can be actively copied offer little strategic advantage. Private agents face risks of internal extraction by creators, who can easily front-run their users.

Information Sources: The data sources an agent accesses are critical to its decision-making. It is essential that these sources are trusted and that the agent does not depend on any single one.

Security: Having smart contract audits and proper custody architectures is vital to ensure backup measures during black swan events.
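As one way to picture the "no single dependency" point under Information Sources, the sketch below combines several price quotes by taking a median and refuses to act when too few sources agree. It is a generic pattern under stated assumptions, not the method of any agent named in this report. The feed names, the three-source quorum, and the 2% tolerance are all hypothetical.

```python
from statistics import median
from typing import Mapping, Optional


def aggregate_price(quotes: Mapping[str, Optional[float]],
                    min_sources: int = 3,
                    max_deviation: float = 0.02) -> float:
    """Return a median price, or raise if too few sources agree.

    quotes maps a source name to its price, or None if the source is down.
    """
    valid = [p for p in quotes.values() if p is not None and p > 0]
    if len(valid) < min_sources:
        raise ValueError("not enough live sources")

    mid = median(valid)
    # Keep only quotes within max_deviation of the first-pass median.
    agreeing = [p for p in valid if abs(p - mid) / mid <= max_deviation]
    if len(agreeing) < min_sources:
        raise ValueError("sources disagree too much")
    return median(agreeing)


# Hypothetical feeds: one outlier and one offline source.
quotes = {"feed_a": 100.0, "feed_b": 100.4, "feed_c": 99.8,
          "feed_d": 140.0, "feed_e": None}
price = aggregate_price(quotes)
print(f"Aggregated price: {price}")
```

The outlier is discarded and the offline feed is ignored, so no single source can move the result on its own. With fewer than three agreeing feeds the function raises instead of returning a price.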

Next Steps for Agents

To enable large-scale adoption, much work remains on infrastructure. This boils down to key issues around trust and execution. Autonomous agents operate without safeguards, and there have been instances of poor fund management.

ERC-8004, launched in January 2026, became the first on-chain registry allowing autonomous agents to discover each other, establish verifiable reputation, and collaborate securely. This is a key unlock for DeFi composability, embedding trust scores directly into smart contracts, enabling permissionless interactions between agents and protocols.

However, this does not guarantee that agents always operate in good faith, as collusion, reputation attacks, and Sybil attacks remain possible. Significant room for improvement still exists in insurance, security, and economic staking of agents.

As agent activity expands in DeFi, strategy congestion becomes a structural risk. Yield farms are the clearest precedent; as strategies proliferate, returns compress. The same dynamic could apply to agent trading: if many agents train and optimize on similar data and targets, they will tend to converge on similar positions and exit signals.
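The yield-farm precedent can be illustrated with a toy model in which a pool pays a fixed amount of rewards per year, shared pro rata across all deposits. This is a deliberately simplified assumption, since real pools have variable rates and incentives. The reward budget and deposit size below are hypothetical.

```python
def pool_apy(annual_rewards_usd: float, tvl_usd: float) -> float:
    """Yield of a pool that splits a fixed reward budget across all deposits."""
    if tvl_usd <= 0:
        raise ValueError("tvl_usd must be positive")
    return annual_rewards_usd / tvl_usd


annual_rewards_usd = 1_000_000     # hypothetical fixed reward budget
deposit_per_agent_usd = 1_000_000  # hypothetical deposit made by each agent

# As more agents converge on the same pool, each one's yield shrinks.
apy_by_agent_count = {
    n: pool_apy(annual_rewards_usd, n * deposit_per_agent_usd)
    for n in (1, 10, 100)
}
for n, apy in apy_by_agent_count.items():
    print(f"{n:>3} agents in the pool: {apy:.1%} APY each")
```

In this model the yield falls in inverse proportion to the capital chasing it, which is the compression described above. Agents that optimize on similar data would pile into the same opportunities and erode them in the same way.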

The CoinAlg paper published by Cornell University in January 2026 formalized this issue. Transparent agents can be arbitraged because their trades are predictable and can be front-run. Private agents avoid this risk but introduce different risks, such as creators retaining informational advantages over their users and extracting value through opacity.

Agent activity will only accelerate, and the infrastructure laid today will determine how on-chain finance evolves in the next phase. As usage increases, agents will self-iterate and become more attuned to user preferences. The key differentiator will therefore be trustworthy infrastructure, which will capture the largest market share.


DWF Deep Research Report: AI Optimizes Yield in DeFi Beyond Humans, Yet Still Lags by 5x in Complex Trading
