Inception Labs has launched Mercury 2, a diffusion-based reasoning model capable of generating over 1,000 tokens per second, three times faster than comparable Inception Labs has launched Mercury 2, a diffusion-based reasoning model capable of generating over 1,000 tokens per second, three times faster than comparable

Inception Labs Launches Mercury 2, Diffusion-Based Reasoning Model Achieving Over 1,000 Tokens Per Second

2026/02/26 17:38
2 min read
Inception Labs Unveils Mercury 2: A Diffusion-Based LLM Delivering Over 1,000 Tokens Per Second For Low-Latency AI Applications

Inception Labs, an AI startup, has launched Mercury 2, a diffusion-based Large Language Model (LLM) designed to significantly accelerate reasoning tasks in production AI applications. 

Unlike traditional autoregressive models that generate text sequentially, Mercury 2 uses a parallel refinement process, producing multiple tokens simultaneously and converging over a small number of steps, enabling speeds of over 1,000 tokens per second on NVIDIA Blackwell GPUs—approximately three times faster than competing models in the same price range.

The model is optimized for real-time responsiveness in complex AI workflows, where latency compounds across multiple inference calls, retrieval pipelines, and agentic loops. Mercury 2 maintains high reasoning quality while reducing latency, allowing developers, voice AI systems, search engines, and other interactive applications to operate at reasoning-grade performance without the delays associated with sequential generation. It supports features such as tunable reasoning, 128K token context windows, schema-aligned JSON output, and native tool integration, providing flexibility for a range of production deployments.

Mercury 2 Enables Low-Latency AI Across Coding, Voice, And Search Workflows 

The report highlights several use cases where low-latency reasoning is critical. In coding and editing workflows, Mercury 2 delivers rapid autocomplete and next-edit suggestions that integrate seamlessly with developers’ thought processes. In agentic workflows, the model allows for more inference steps without exceeding latency budgets, improving the quality and depth of automated decision-making. Voice-based AI and interactive applications benefit from its ability to generate reasoning-quality responses within natural speech cadences, enhancing user experiences in real-time conversation scenarios. Additionally, Mercury 2 supports multi-hop search and retrieval pipelines, enabling rapid summarization, reranking, and reasoning without compromising response times.

Early adopters have noted significant improvements in throughput and user experience. Mercury 2 has been described as at least twice as fast as GPT-5.2 while maintaining competitive quality, with applications spanning real-time transcript cleanup, interactive human-computer interfaces, autonomous advertising optimization, and voice-enabled AI avatars.

The model is compatible with the OpenAI API, allowing integration into existing stacks without extensive modification, and Inception Labs offers support for enterprise evaluations, performance validation, and workload-specific deployment guidance. Mercury 2 represents a step forward in diffusion-based LLMs, redefining the balance between reasoning quality and latency in production AI environments.

The post Inception Labs Launches Mercury 2, Diffusion-Based Reasoning Model Achieving Over 1,000 Tokens Per Second appeared first on Metaverse Post.

Market Opportunity
Ucan fix life in1day Logo
Ucan fix life in1day Price(1)
$0.000654
$0.000654$0.000654
-12.76%
USD
Ucan fix life in1day (1) Live Price Chart
Disclaimer: The articles reposted on this site are sourced from public platforms and are provided for informational purposes only. They do not necessarily reflect the views of MEXC. All rights remain with the original authors. If you believe any content infringes on third-party rights, please contact [email protected] for removal. MEXC makes no guarantees regarding the accuracy, completeness, or timeliness of the content and is not responsible for any actions taken based on the information provided. The content does not constitute financial, legal, or other professional advice, nor should it be considered a recommendation or endorsement by MEXC.

You May Also Like

The Manchester City Donnarumma Doubters Have Missed Something Huge

The Manchester City Donnarumma Doubters Have Missed Something Huge

The post The Manchester City Donnarumma Doubters Have Missed Something Huge appeared on BitcoinEthereumNews.com. MANCHESTER, ENGLAND – SEPTEMBER 14: Gianluigi Donnarumma of Manchester City celebrates the second City goal during the Premier League match between Manchester City and Manchester United at Etihad Stadium on September 14, 2025 in Manchester, England. (Photo by Visionhaus/Getty Images) Visionhaus/Getty Images For a goalkeeper who’d played an influential role in the club’s first-ever Champions League triumph, it was strange to see Gianluigi Donnarumma so easily discarded. Soccer is a brutal game, but the sudden, drastic demotion of the Italian from Paris Saint-Germain’s lineup for the UEFA Super Cup clash against Tottenham Hotspur before he was sold to Manchester City was shockingly brutal. Coach Luis Enrique isn’t a man who minces his words, so he was blunt when asked about the decision on social media. “I am supported by my club and we are trying to find the best solution,” he told a news conference. “It is a difficult decision. I only have praise for Donnarumma. He is one of the very best goalkeepers out there and an even better man. “But we were looking for a different profile. It’s very difficult to take these types of decisions.” The last line has really stuck, especially since it became clear that Manchester City was Donnarumma’s next destination. Pep Guardiola, under whom the Italian will be playing this season, is known for brutally axing goalkeepers he didn’t feel fit his profile. The most notorious was Joe Hart, who was jettisoned many years ago for very similar reasons to Enrique. So how can it be that the Catalan coach is turning once again to a so-called old-school keeper? Well, the truth, as so often the case, is not quite that simple. As Italian soccer expert James Horncastle pointed out in The Athletic, Enrique’s focus on needing a “different profile” is overblown. Lucas Chevalier,…
Share
BitcoinEthereumNews2025/09/18 07:38
“We Cannot in Good Conscience Agree”: Anthropic Defies Pentagon Over AI Weapons

“We Cannot in Good Conscience Agree”: Anthropic Defies Pentagon Over AI Weapons

TLDR The Pentagon is demanding Anthropic remove safety guardrails from its Claude AI so it can be used for any lawful purpose, including autonomous weapons and
Share
Coincentral2026/02/27 20:18
Wormhole Unleashes W 2.0 Tokenomics for a Connected Blockchain Future

Wormhole Unleashes W 2.0 Tokenomics for a Connected Blockchain Future

TLDR Wormhole reinvents W Tokenomics with Reserve, yield, and unlock upgrades. W Tokenomics: 4% yield, bi-weekly unlocks, and a sustainable Reserve Wormhole shifts to long-term value with treasury, yield, and smoother unlocks. Stakers earn 4% base yield as Wormhole optimizes unlocks for stability. Wormhole’s new Tokenomics align growth, yield, and stability for W holders. Wormhole [...] The post Wormhole Unleashes W 2.0 Tokenomics for a Connected Blockchain Future appeared first on CoinCentral.
Share
Coincentral2025/09/18 02:07