Vitalik Buterin just published a research proposal that sidesteps the question everyone keeps asking: can blockchains run AI models?
Instead, the research positions Ethereum as the privacy-preserving settlement layer for metered AI and API usage. The post, co-authored with Davide Crapis and published on Ethereum Research, argues that the real opportunity isn't putting LLMs on-chain.
The real opportunity lies in building the infrastructure that enables agents and users to pay for thousands of API calls without compromising identity or creating surveillance trails through billing data.
The timing is critical because agentic AI is moving from demonstrations to enterprise roadmaps. Gartner forecasts that 40% of enterprise applications will include task-specific AI agents by the end of 2026, up from under 5% in 2025.
That shift implies a world in which software autonomously generates massive volumes of API calls, making billing rails strategic infrastructure rather than back-office plumbing.
Current metering systems force a choice between Web2 identity billing, which relies on API keys and credit cards and leaks profiling data, and on-chain pay-per-call models that are too slow, too expensive, and link activity through transparent transaction graphs.
The proposal introduces ZK API usage credits, a payment and anti-abuse primitive built on Rate-Limiting Nullifiers.
RLN is a zero-knowledge gadget designed to prevent spam in anonymous systems, and the research repurposes it for metered access to services.
The flow proceeds as follows: users deposit funds once into a smart contract, and their commitment is added to an on-chain Merkle tree.
Each API request includes a zero-knowledge proof demonstrating that the user is a valid depositor with sufficient credit for the requested index.
If a user attempts to reuse a ticket index, double-spending their allowance, RLN allows the system to recover their secret and slash their stake as an economic penalty.
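The slashing step can be illustrated with a minimal sketch of the secret-sharing idea behind RLN. It assumes the usual RLN construction, in which each request reveals one point on a line whose intercept is the user's secret and whose slope is derived from the secret and the ticket index. The field prime, the use of SHA-256 in place of a SNARK-friendly hash, and the names `make_share` and `recover_secret` are illustrative assumptions rather than details from the proposal, and the zero-knowledge proof and Merkle membership check are omitted entirely.

```python
import hashlib

# Illustrative prime field (2**255 - 19). Production RLN circuits use a
# SNARK-friendly field and hash; SHA-256 is only a stand-in here.
P = 2**255 - 19


def field_hash(*parts: bytes) -> int:
    """Hash byte strings into a field element (length-prefixed per part)."""
    hasher = hashlib.sha256()
    for part in parts:
        hasher.update(len(part).to_bytes(4, "big") + part)
    return int.from_bytes(hasher.digest(), "big") % P


def to_bytes(value: int) -> bytes:
    return value.to_bytes(32, "big")


def make_share(secret: int, ticket_index: int, request: bytes):
    """Return (x, y, nullifier) for one request made with a ticket index."""
    # The slope depends only on the secret and the ticket index, so every
    # request that reuses an index lies on the same line.
    slope = field_hash(to_bytes(secret), to_bytes(ticket_index))
    x = field_hash(request)
    y = (secret + slope * x) % P
    # The nullifier is the same for every use of one index and reveals
    # nothing about the secret on its own.
    nullifier = field_hash(to_bytes(slope))
    return x, y, nullifier


def recover_secret(share_a, share_b) -> int:
    """Recover the secret from two shares that reused one ticket index."""
    (x1, y1, n1), (x2, y2, n2) = share_a, share_b
    if n1 != n2:
        raise ValueError("different ticket indices: nothing to recover")
    if x1 == x2:
        raise ValueError("same request replayed: only one point on the line")
    # Two distinct points determine the line; its intercept is the secret.
    slope = (y1 - y2) * pow((x1 - x2) % P, -1, P) % P
    return (y1 - slope * x1) % P
```

An honest user reveals at most one point per ticket index, which says nothing about the intercept. Reusing an index for a second, different request reveals a second point on the same line, and anyone holding both can compute the secret and claim the stake.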
The post includes concrete examples. A user deposits 100 USDC and makes 500 hosted LLM queries. Another deposits 10 USDC for 10,000 Ethereum RPC calls.
The architecture is explicitly designed for “many calls per deposit,” meaning that on-chain activity scales with the number of accounts and settlement frequency rather than raw inference volume.
Variable-cost support adds flexibility: users prepay a maximum cost per call, servers return signed refund tickets for unused amounts, and users privately accumulate refunds to unlock more calls without additional deposits.
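A rough way to see how refunds stretch a deposit is to model the accounting in the clear. The sketch below assumes the solvency condition is that the next ticket, priced at the maximum cost, must be covered by the deposit plus accumulated refunds. The class name `CreditAccount`, its fields, and the prices used are hypothetical, and in the actual design this check would be proven in zero knowledge rather than tracked on a visible balance sheet.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class CreditAccount:
    deposit: Decimal                 # one-time on-chain deposit
    max_cost: Decimal                # amount prepaid per call
    refunds: Decimal = Decimal("0")  # signed refund tickets accumulated so far
    next_index: int = 0              # next unused ticket index

    def can_call(self) -> bool:
        # Assumed solvency rule: every ticket up to and including the next
        # one, priced at max_cost, must be covered by deposit plus refunds.
        return (self.next_index + 1) * self.max_cost <= self.deposit + self.refunds

    def call(self, actual_cost: Decimal) -> Decimal:
        """Spend one ticket and return the refund the server would sign."""
        if not self.can_call():
            raise RuntimeError("insufficient credit for the next ticket")
        if actual_cost > self.max_cost:
            raise ValueError("actual cost exceeds the prepaid maximum")
        refund = self.max_cost - actual_cost
        self.refunds += refund
        self.next_index += 1
        return refund


def calls_until_exhausted(account: CreditAccount, actual_cost: Decimal) -> int:
    """Count how many calls fit before the solvency rule fails."""
    calls = 0
    while account.can_call():
        account.call(actual_cost)
        calls += 1
    return calls
```

With a hypothetical 1.00 deposit and a 0.25 maximum price, four calls fit if every call costs the maximum. If each call actually costs 0.10, the refunds returned along the way leave room for more calls from the same deposit.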
The proposal arrives when the payment substrate for usage credits already exists at scale.
Stablecoins have a circulating market cap of approximately $307.6 billion, according to DefiLlama, indicating that the on-chain dollar layer is sufficiently liquid to support deposit-based billing for high-frequency services.
Ethereum's scaling stack has matured to the point where rollups process far more activity than layer 1: L2Beat shows a roughly 100x scaling factor, with rollups handling thousands of operations per second compared with tens on the Ethereum mainnet.
Average Ethereum transaction fees measured around $0.21 on Feb. 7, suggesting that occasional on-chain metering and settlement flows are feasible without prohibitive cost.
The design explicitly avoids putting LLMs on-chain. Ethereum competes on neutral settlement, programmable escrow, and verifiable enforcement, not TPU cycles or inference speed.
The architecture treats inference as an off-chain service and the blockchain as the layer that makes payment, metering, and dispute resolution credible, without requiring users to trust individual providers or to reveal their identities.
If AI service providers accept deposits and rely on Ethereum or layer 2 smart contracts to adjudicate slashing, refunds, and disputes, Ethereum becomes the enforcement layer for AI commerce.
The model parallels how Ethereum became the settlement layer for stablecoins and DeFi, not by hosting the full application stack on-chain, but by providing a neutral substrate where economic agreements are enforced programmatically.
The on-chain footprint is bounded by settlement cadence, not raw call volume.
In a crypto-native wedge scenario targeting RPC and infrastructure APIs, suppose 250,000 power users or agents adopt usage credits.
If each performs two on-chain actions per month, a deposit or top-up plus a withdrawal, that generates roughly 500,000 transactions monthly attributable to the rail.
In an AI provider adoption scenario, imagine one million users employ privacy-preserving credits across hosted LLM services but still perform only one to three on-chain actions monthly.
That implies one million to three million transactions per month tied to AI commerce rails, likely concentrated on layer 2s where execution is cheaper.
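The arithmetic behind these scenarios is simple enough to reproduce. The sketch below multiplies the account counts by the monthly on-chain actions quoted above; the function name and scenario labels are illustrative only.

```python
def monthly_transactions(accounts: int, actions_per_account: int) -> int:
    """On-chain transactions per month, independent of API call volume."""
    return accounts * actions_per_account


# Figures taken from the scenarios in the text; labels are illustrative.
scenarios = {
    "crypto-native wedge": monthly_transactions(250_000, 2),
    "AI provider adoption (low)": monthly_transactions(1_000_000, 1),
    "AI provider adoption (high)": monthly_transactions(1_000_000, 3),
}

for name, transactions in scenarios.items():
    print(f"{name}: {transactions:,} transactions per month")
```

The number of API calls never enters the calculation, which is the point of the design: the on-chain load depends on how many accounts settle and how often, not on how much inference they consume.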
Enterprise agent scenarios increase deposit sizes, raising the stakes for credible enforcement and making slashing mechanisms more consequential.
The proposal tries to make payments unlinkable, but the research thread itself highlights a potential weakness.
A commenter argues that even if nullifiers are cryptographically unlinkable, servers can correlate users through inference-based metadata such as timing patterns, token counts, and cache hits.
The critique proposes bucketed pricing, with fixed input and output classes, to reduce leakage. That tension between cryptographic privacy and behavioral metadata is central to whether the design actually delivers on its anonymity goals.
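Bucketed pricing can be sketched in a few lines. The idea is that the server only ever sees, and bills for, one of a small set of size classes, so exact token counts stop acting as a fingerprint. The class boundaries and the helper name `bucket_for` below are assumptions chosen for illustration; neither the proposal nor the critique specifies them.

```python
import bisect

# Hypothetical size classes, in tokens. Real boundaries would be set by the provider.
INPUT_BUCKETS = [256, 1024, 4096, 16384]
OUTPUT_BUCKETS = [128, 512, 2048]


def bucket_for(tokens: int, buckets: list[int]) -> int:
    """Round a token count up to the smallest class that can hold it."""
    if tokens < 0:
        raise ValueError("token count cannot be negative")
    position = bisect.bisect_left(buckets, tokens)
    if position == len(buckets):
        raise ValueError("request exceeds the largest class")
    return buckets[position]


def billed_classes(input_tokens: int, output_tokens: int) -> tuple[int, int]:
    """Return the (input, output) classes a request would be billed under."""
    return (
        bucket_for(input_tokens, INPUT_BUCKETS),
        bucket_for(output_tokens, OUTPUT_BUCKETS),
    )
```

Under these assumed classes, a 300-token prompt and a 900-token prompt are billed identically, which removes one correlation signal at the cost of some overpayment. Timing patterns and cache hits would still need separate mitigation.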
Implementation reality presents another hurdle. The proposal uses RLN as a primitive, but the Privacy and Scaling Explorations project page notes that RLN is inactive or has been sunset.
Productionizing ZK API usage credits likely requires maintaining forks or implementing new solutions rather than relying on existing tooling.
RLNJS benchmarks report roughly 800 milliseconds for proof generation and 130 milliseconds for verification on an M2 Mac, providing an early sanity check on performance but leaving open questions about mobile constraints and production-grade circuits at scale.
The proposal also assumes that providers will integrate the deposit-and-proof flow, accept stablecoin settlements, and adopt Ethereum or layer 2 contracts for dispute resolution.
That's a coordination problem, not just a technical one. Web2 API providers have existing billing infrastructure and regulatory clarity around identity-linked transactions.
Convincing them to adopt a ZK-based alternative requires demonstrating either a compelling cost advantage or a differentiated market segment in which privacy-preserving billing unlocks revenue they could not otherwise capture.
| Model | How it bills | What it leaks/breaks | Who it suits |
|---|---|---|---|
| Web2 identity billing (API keys + cards) | Account-based billing tied to identity (API key + payment method); provider meters requests and invoices centrally | Leaks: identity linkage + profiling trails across requests. Breaks: pseudonymity/self-custody norms. Risk: centralized control (suspension/censorship, single-provider trust) | Mainstream SaaS/API providers; enterprises prioritizing compliance, simplicity, and existing billing rails |
| Onchain pay-per-call | Each request (or batch) pays onchain per call via transactions/smart contracts | Breaks: cost/latency for high-frequency calls. Leaks: onchain linkability (transaction graph ties usage together). Friction: UX overhead for repeated txs | Crypto-native services with low call frequency; cases where transparency/auditability is more important than privacy/throughput |
| ZK API usage credits (deposit once, many calls) | User deposits once; each request carries a ZK proof of membership + remaining credit; slashing for double-use; optional refund tickets for variable cost | Risk: metadata correlation (timing/token patterns can re-link). Burden: provider integration + coordination. Maturity: ZK tooling/ops complexity, circuit maintenance | High-frequency APIs (LLMs, RPC, data) where privacy is a selling point; agent toolchains; users needing metering without identity-based surveillance |
If the design gains traction, Ethereum's value proposition shifts further toward serving as a neutral enforcement layer for digital commerce rather than a general-purpose computing platform.
The proposal treats blockchain as the settlement substrate where economic rules get enforced credibly, not the place where applications run.
Stablecoin velocity could rise as deposits flow into usage credit contracts, creating a new category of on-chain economic activity distinct from DeFi speculation or NFT trading.
Layer 2 utilization could increase as providers and users resolve disputes, process refunds, and handle slashing events on throughput-optimized chains.
The question is whether a parallel ecosystem emerges in which privacy-preserving billing becomes a prerequisite for certain user segments.
Enterprises concerned about data leakage through billing logs, developers building agent toolchains that require auditable metering without surveillance, and power users who value pseudonymous access to high-frequency services are all potential early adopters.
Ethereum's opportunity is to serve as the layer on which AI service markets settle, without requiring participants to trust individual platforms or to sacrifice privacy to billing infrastructure.
The proposal claims Ethereum can enforce payment agreements, adjudicate disputes, and enable metered access without identity linkage in ways that traditional systems structurally cannot.
Whether that claim holds depends on solving the metadata correlation problem, maintaining robust ZK implementations, and convincing providers that the market it unlocks justifies the integration cost.
The post Vitalik Buterin pitches Ethereum as the AI settlement layer, but one hidden leak could ruin it appeared first on CryptoSlate.
