
OpenAI Introduces Smart Contract Benchmark for AI Agents as AI and Crypto Converge

2026/02/19 07:37

OpenAI has introduced a new smart contract security benchmark as AI agents gain stronger coding abilities in the crypto sector. OpenAI, working with Paradigm, said the benchmark, called EVMbench, tests how well AI systems detect, patch, and exploit serious Ethereum contract bugs. The effort responds to growing financial risk, since smart contracts routinely secure over $100 billion in open-source crypto assets.

OpenAI Smart Contract Benchmark Targets Real Audit Vulnerabilities

In its release, OpenAI said EVMbench draws on 120 curated vulnerabilities collected from 40 professional smart contract audits. Most of the issues came from open audit competitions, including Code4rena. OpenAI said the benchmark also includes vulnerability scenarios drawn from security auditing work for the Tempo blockchain.

Tempo is described as a purpose-built Layer-1 network designed for high-throughput, low-cost stablecoin payments. These scenarios therefore extend the benchmark into payment-focused contract code. OpenAI also said it expects agent-based stablecoin payment activity to grow.

To build the benchmark environments, OpenAI said it adapted existing exploit proof-of-concept tests and deployment scripts when they were available. Where no scripts existed, engineers wrote the missing components by hand. OpenAI added that it checked that the vulnerabilities in patch tasks are exploitable and can be fixed without breaking compilation.

Detect, Patch, Exploit Modes Test AI Agents Under Pressure

OpenAI said EVMbench evaluates artificial intelligence agents in three modes: detect, patch, and exploit. In detect mode, agents audit smart contract repositories and are scored on recall of confirmed vulnerabilities and on the associated audit rewards. In patch mode, agents must modify vulnerable contracts to remove the bug while keeping intended functionality intact.
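As a rough illustration of recall-based scoring in detect mode, the sketch below computes the share of confirmed vulnerabilities that an agent reported. The function name and the finding IDs are hypothetical, and the article does not describe how EVMbench matches findings or weights audit rewards, so this covers only the plain recall calculation.

```python
def detect_recall(reported, confirmed):
    """Fraction of confirmed vulnerabilities that the agent reported.

    Both arguments are iterables of finding IDs. The IDs used below are
    made up for illustration and are not taken from EVMbench.
    """
    confirmed = set(confirmed)
    if not confirmed:
        raise ValueError("confirmed must contain at least one finding")
    # Reported findings that are not in the confirmed set do not raise recall.
    return len(confirmed & set(reported)) / len(confirmed)


confirmed_ids = {"H-01", "H-02", "H-03", "H-04"}
reported_ids = {"H-01", "H-03", "X-99"}  # X-99 is not a confirmed finding

recall = detect_recall(reported_ids, confirmed_ids)
print(f"recall = {recall:.2f}")  # 2 of 4 confirmed findings reported
```

Recall alone does not penalize false reports such as X-99 here, which is one reason a benchmark might pair it with a second signal such as audit rewards.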

Exploit mode, however, focuses on full end-to-end fund draining attacks in a sandbox blockchain environment. The company said graders verify results using transaction replay and on-chain checks. To support reproducible evaluation, the company said it developed a Rust-based harness to deploy contracts and replay transactions deterministically.
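One way to picture an on-chain check for a fund-draining task is a comparison of balances before and after the agent's transactions are replayed. The sketch below is an assumption about what such a check could look like, not OpenAI's grader: the function name, the account labels, and the threshold are all hypothetical, and balances are plain integers in wei rather than values read from a node.

```python
WEI_PER_ETH = 10**18


def exploit_succeeded(before, after, victim, attacker, min_drained_wei):
    """Return True if the victim lost at least min_drained_wei and the
    attacker ended up with more funds than it started with.

    before/after map account labels to balances in wei (hypothetical
    snapshots taken before and after replaying the agent's transactions).
    """
    drained = before[victim] - after[victim]
    gained = after[attacker] - before[attacker]
    return drained >= min_drained_wei and gained > 0


before = {"vault": 10 * WEI_PER_ETH, "attacker": 1 * WEI_PER_ETH}
# After the replay: the vault is empty, and the attacker holds the funds
# minus some gas spent on the attack transactions.
after = {"vault": 0, "attacker": 10 * WEI_PER_ETH + 9 * WEI_PER_ETH // 10}

print(exploit_succeeded(before, after, "vault", "attacker", 10 * WEI_PER_ETH))
```

Because the check only reads final state, it gives the same answer every time the same transactions are replayed against the same starting state, which is the property a deterministic harness relies on.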

The exploit tasks run in an isolated local Anvil environment rather than on live crypto networks. OpenAI also said the vulnerabilities used in the benchmark are historical and publicly documented, and that the harness restricts unsafe RPC methods to limit abuse.
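Restricting RPC methods can be sketched as a filter that sits in front of the local node and rejects calls by method name. The article does not say which methods the harness blocks, so the prefix list below is an assumption: it targets node-control namespaces that Anvil exposes, such as `anvil_setBalance` or `evm_snapshot`, which would let an agent rewrite chain state instead of exploiting the contract. The function names are hypothetical.

```python
import json

# Assumed blocklist: namespaces that control the node rather than
# interact with the chain like a normal user would.
BLOCKED_PREFIXES = ("anvil_", "hardhat_", "evm_", "debug_")


def is_allowed(method):
    """Allow a JSON-RPC method unless it starts with a blocked prefix."""
    return not method.startswith(BLOCKED_PREFIXES)


def filter_request(raw_request):
    """Return None if the request may be forwarded to the node,
    otherwise a JSON-RPC 2.0 error response as a dict."""
    request = json.loads(raw_request)
    method = request.get("method", "")
    if is_allowed(method):
        return None
    return {
        "jsonrpc": "2.0",
        "id": request.get("id"),
        # -32601 is the standard JSON-RPC "method not found" code.
        "error": {"code": -32601, "message": f"method {method} is not allowed"},
    }


ok = '{"jsonrpc": "2.0", "id": 1, "method": "eth_sendRawTransaction", "params": []}'
bad = '{"jsonrpc": "2.0", "id": 2, "method": "anvil_setBalance", "params": []}'

print(filter_request(ok))   # None, so the call would be forwarded
print(filter_request(bad))  # error response, the call is rejected
```

A stricter design would use an allowlist of the `eth_` methods an agent actually needs and reject everything else, since a blocklist has to be kept up to date as the node adds methods.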

In exploit testing, OpenAI said GPT-5.3-Codex running via Codex CLI scored 72.2%. By comparison, the earlier GPT-5 model, released just over six months before it, scored 31.9%. OpenAI also noted that detect recall and patch success remain below full coverage.

OpenAI Adds New Talent with Agent Hire

While OpenAI pushed EVMbench into public view, it also expanded its agent development team. The company hired Peter Steinberger, founder of the viral open-source AI agent project OpenClaw, previously known as Clawdbot. Sam Altman confirmed on X that Steinberger will join OpenAI to lead work on the “next generation of personal agents.”

Altman also said OpenClaw will move into a foundation as an open-source project supported by OpenAI. The project will continue under that structure, according to the announcement. The hiring drew wide attention as OpenAI increases its focus on autonomous and personal AI agents.

Source: https://coingape.com/openai-introduces-smart-contract-benchmark-for-ai-agents-as-ai-and-crypto-converge/
