DeepSeek Introduces mHC Architecture to Improve Large Model Training

2026/01/02 00:43

TLDR

  • DeepSeek introduced Manifold-Constrained Hyper-Connections (mHC) to improve large-model training scalability and efficiency.
  • The mHC method was tested on 3B, 9B, and 27B parameter models, showing stable performance without added computational cost.
  • mHC builds on ByteDance’s 2024 hyper-connection architecture by adding a manifold constraint to reduce memory overhead.
  • CEO Liang Wenfeng co-authored and uploaded the paper, reaffirming his direct involvement in DeepSeek’s technical development.
  • Industry observers expect a new DeepSeek model release ahead of Spring Festival 2026, based on the company’s publication patterns.

DeepSeek has released a new AI training method, Manifold-Constrained Hyper-Connections (mHC), in a paper uploaded to arXiv by CEO Liang Wenfeng. The architecture aims to improve training scalability for large models while keeping computational costs low. Researchers tested the method on models with 3, 9, and 27 billion parameters, showing consistent training efficiency. This comes as the company is expected to launch a new model before the Spring Festival in February 2026.

DeepSeek Builds on ResNet and Hyper-Connection Foundations

According to a report by SCMP, the mHC method enhances earlier hyper-connection (HC) designs first proposed by ByteDance in 2024 as an improvement to ResNet. ResNet allows deeper neural networks by preserving signal strength across layers, but faces challenges in maintaining efficient learning at large scale. ByteDance’s HC improved signal flow but didn’t fully address memory usage in larger models.

DeepSeek introduced a manifold constraint to limit expansion and better control memory and compute costs during training. This adjustment preserved the benefits of HC while making the network suitable for larger training tasks. The researchers wrote that mHC maintained performance at scale without adding computational overhead during training.
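To make the idea of constraining a multi-stream residual path more concrete, here is a minimal sketch in plain Python. The article does not spell out the constraint, so the sketch assumes, purely for illustration, that the matrix mixing the parallel residual streams is pushed toward a doubly stochastic form (non-negative, rows and columns summing to 1) by Sinkhorn-style normalization. The names `sinkhorn`, `mix`, and `run` are hypothetical, each stream is reduced to a single number, and the layer computation itself is left out so that only the effect of the mixing step across depth is visible.

```python
# Illustrative sketch only: the doubly stochastic constraint and every name
# below are assumptions for this example, not details taken from the article.


def sinkhorn(matrix, iterations=50):
    """Alternately rescale rows and columns of a positive square matrix
    so that each row and each column sums to (approximately) 1."""
    m = [row[:] for row in matrix]
    n = len(m)
    for _ in range(iterations):
        for i in range(n):
            row_sum = sum(m[i])
            m[i] = [v / row_sum for v in m[i]]
        for j in range(n):
            col_sum = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= col_sum
    return m


def mix(matrix, streams):
    """Mix the parallel residual streams with one matrix-vector product."""
    return [sum(w * s for w, s in zip(row, streams)) for row in matrix]


def run(matrix, streams, depth):
    """Apply the same mixing step once per layer for `depth` layers."""
    for _ in range(depth):
        streams = mix(matrix, streams)
    return streams


raw = [[1.2, 0.5], [0.4, 1.1]]   # unconstrained mixing weights (made up)
streams = [1.0, -0.5]            # two toy residual streams
depth = 30

unconstrained = run(raw, streams, depth)
constrained = run(sinkhorn(raw), streams, depth)

print("unconstrained magnitude:", max(abs(v) for v in unconstrained))
print("constrained magnitude:  ", max(abs(v) for v in constrained))
```

In this toy setting the unconstrained matrix amplifies the streams layer after layer, while the normalized matrix keeps their magnitude bounded and their sum unchanged. That is one simple way to picture the signal preservation across depth that the article describes, not a reproduction of DeepSeek's method.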

Lead authors Zhenda Xie, Yixuan Wei, and Huanqi Cao explained that the system enables stable deep learning without collapse. They confirmed mHC works with minimal infrastructure adjustments, making it efficient for broader deployment. The architecture was tested across multiple model sizes, confirming the technique’s adaptability and reliability. DeepSeek reported that the method handled signal preservation and scalability better than previous HC-based frameworks.

Liang Wenfeng Directly Leads Technical Advancement

CEO Liang Wenfeng was listed as the final author and uploaded the paper himself, continuing his role in major DeepSeek research. He has consistently shared technical papers linked to the company’s top models, such as R1 and V3, on arXiv. Other researchers typically upload supporting studies not directly tied to product development.

His involvement in this paper signals continued leadership in the company’s core AI work. The release underscores DeepSeek’s approach of linking internal research closely with future product direction. Florian Brand, a PhD researcher at Trier University, said DeepSeek papers often indicate what models are coming next.

He noted that the R1 model followed a similar pattern of publication and then launch. Liang’s involvement has again drawn attention from analysts watching DeepSeek’s release schedule. The company has not announced a date, but its publication strategy has become predictable. DeepSeek has remained quiet on details, but research uploads suggest new systems are under development.

The post DeepSeek Introduces mHC Architecture to Improve Large Model Training appeared first on Blockonomi.
