
Anthropic Trains Its AI Models to Detect Smart Contract Vulnerabilities, Uncovers $4.6 Million in “Hacks”

  • The models identified exploits in contracts that had already been compromised.
  • The AI found 19 vulnerabilities dated after the knowledge cutoff, plus two zero-day flaws.
  • Anthropic is releasing an open-source security testing benchmark.

Anthropic has unveiled the results of a study on how well modern AI models can identify vulnerabilities in smart contracts. The developers tested Claude Sonnet 4.5, Claude Opus 4.5, and GPT-5 on the SCONE-bench suite, which covers Ethereum and BNB Chain contract vulnerabilities from 2020-2025.

During the tests, the models successfully simulated exploits for about half of the historical incidents. Measured by the assets held in the affected contracts at the time of the attacks, the total notional value exceeded $550 million.
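The article does not describe the evaluation harness itself, but the scoring idea can be illustrated: for each historical incident, a model-written exploit is run against a simulated copy of the pre-attack chain state, a case counts as solved if the exploit extracts value, and the funds held by the contract are credited as notional value. The sketch below is a minimal Python illustration under those assumptions; the field names, the simulate_exploit callable, and the demo data are hypothetical, not part of SCONE-bench.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class BenchmarkCase:
    """One historical incident in a SCONE-bench-style suite (fields assumed)."""
    contract_address: str
    chain: str                # e.g. "ethereum" or "bnb"
    value_at_risk_usd: float  # assets held by the contract at attack time


def score_suite(
    cases: list[BenchmarkCase],
    simulate_exploit: Callable[[BenchmarkCase], float],
) -> tuple[int, float]:
    """Count cases where a model-written exploit extracts value in simulation.

    `simulate_exploit` stands in for the agent loop: it should run the model's
    exploit against a fork of the chain state just before the incident and
    return the value (in USD) the exploit drains; 0.0 means failure.
    """
    solved, notional = 0, 0.0
    for case in cases:
        extracted = simulate_exploit(case)
        if extracted > 0:
            solved += 1
            # Credit the funds at risk as notional value, per the article's metric.
            notional += case.value_at_risk_usd
    return solved, notional


if __name__ == "__main__":
    demo = [
        BenchmarkCase("0xabc", "ethereum", 1_000_000.0),
        BenchmarkCase("0xdef", "bnb", 250_000.0),
    ]
    # Trivial stand-in simulator: pretend only the Ethereum exploit succeeds.
    solved, notional = score_suite(demo, lambda c: 1.0 if c.chain == "ethereum" else 0.0)
    print(f"{solved}/{len(demo)} solved, ${notional:,.0f} notional")
```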

Vulnerability search results using various AI models. Data: Anthropic.

A separate block of tests covered contracts hacked after March 2025, the models' knowledge cutoff date. On this sample, the AI agents identified 19 vulnerabilities out of 34, corresponding to an estimated value of about $4.6 million.

These cases were not known to the models in advance and included several new types of flaws, company officials said.

Claude Opus 4.5 performed best on SCONE-bench. It generated exploits for 17 cases, 50% of the sample, which would potentially translate into $4.5 million in notional “revenue.”

The other models, Claude Sonnet 4.5 and GPT-5, together with Opus 4.5 detected a combined 19 vulnerabilities across the 34 contracts tested. That is about 55.8% of the test suite and roughly $4.6 million in notional funds.
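As a quick arithmetic check, the reported shares follow directly from the raw counts above (a small Python sketch; the counts are taken from the figures in the text):

```python
# Reproducing the reported success rates from the raw counts in the article.
total_cases = 34
opus_solved = 17          # Claude Opus 4.5 alone
combined_solved = 19      # Sonnet 4.5, GPT-5 and Opus 4.5 combined

print(f"Opus 4.5: {opus_solved / total_cases:.2%}")      # 50.00%
print(f"Combined: {combined_solved / total_cases:.2%}")  # 55.88%, ~55.8% in the text
```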

Anthropic also tested whether the AI could find previously unknown issues in recently deployed contracts. Two such zero-day vulnerabilities were found among the new addresses. This, experts said, showed the models’ ability to identify bugs without prior signals or historical data.

The company notes that the research is not aimed at exploiting vulnerabilities but at building tools to evaluate how well AI systems recognize flaws in code. Anthropic plans to use SCONE-bench as an open standard for testing and comparing LLM capabilities.

The authors of the paper envision that such models can be applied to the development and auditing of smart contracts, helping to detect bugs before deployment to the network.
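As one hedged illustration of that pre-deployment auditing idea, the sketch below asks a Claude model to review contract source through the Anthropic Python SDK's Messages API. The model identifier, prompt wording, and file name are placeholder assumptions, not the setup used in the study.

```python
# Minimal sketch of asking an LLM to flag issues in contract source before
# deployment. Requires the `anthropic` package and an ANTHROPIC_API_KEY env
# variable; the model name and prompt are illustrative assumptions.
import anthropic

AUDIT_PROMPT = (
    "You are reviewing a Solidity smart contract before deployment. "
    "List any vulnerabilities (reentrancy, access control, arithmetic, "
    "oracle manipulation) with the affected function and a short rationale.\n\n"
)


def audit_contract(source_path: str, model: str = "claude-opus-4-5") -> str:
    with open(source_path, encoding="utf-8") as f:
        source = f.read()
    client = anthropic.Anthropic()
    response = client.messages.create(
        model=model,
        max_tokens=2048,
        messages=[{"role": "user", "content": AUDIT_PROMPT + source}],
    )
    # The Messages API returns a list of content blocks; keep the text ones.
    return "".join(block.text for block in response.content if block.type == "text")


if __name__ == "__main__":
    print(audit_contract("Vault.sol"))  # hypothetical contract file
```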

Anthropic also points out that the study does not reflect the full level of risk because the analysis is limited to a sample of historical contracts and a controlled environment. The company will continue to expand the benchmark and explore the use of AI tools to support teams working with blockchain protocol security.
