Alibaba Cloud has open-sourced its Qwen3-ASR and Qwen3-ForcedAligner AI models, delivering state-of-the-art speech recognition and forced alignment performance.

Qwen Open-Sources Advanced ASR And Forced Alignment Models With Multi-Language Capabilities

2026/01/29 22:30

Alibaba Cloud announced that it has made its Qwen3-ASR and Qwen3-ForcedAligner AI models open-source, offering advanced tools for speech recognition and forced alignment. 

The Qwen3-ASR family includes two all-in-one models, Qwen3-ASR-1.7B and Qwen3-ASR-0.6B, which support language identification and transcription across 52 languages and accents, leveraging large-scale speech data and the Qwen3-Omni foundation model. 

Internal testing indicates that the 1.7B model delivers state-of-the-art accuracy among open-source ASR systems, while the 0.6B version balances performance and efficiency: under high concurrency, it can transcribe 2,000 seconds of speech per second of processing time.

The Qwen3-ForcedAligner-0.6B model uses a non-autoregressive LLM approach to align text and speech in 11 languages, outperforming leading forced-alignment solutions in both speed and accuracy.

Alibaba Cloud has also released a comprehensive inference framework under the Apache 2.0 license, supporting streaming, batch processing, timestamp prediction, and fine-tuning, aimed at accelerating research and practical applications in audio understanding.

Qwen3-ASR And Qwen3-ForcedAligner Models Demonstrate Leading Accuracy And Efficiency

Alibaba Cloud has released performance results for its Qwen3-ASR and Qwen3-ForcedAligner models, demonstrating leading accuracy and efficiency across diverse speech recognition tasks. 

The Qwen3-ASR-1.7B model achieves state-of-the-art results among open-source systems, outperforming commercial APIs and other open-source models in English, multilingual, and Chinese dialect recognition, including Cantonese and 22 regional variants. 

It maintains reliable accuracy in challenging acoustic conditions, such as low signal-to-noise environments and child or elderly speech. It also handles singing voice transcription, with reported average word error rates of 13.91% in Chinese and 14.60% in English when background music is present.
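For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The sketch below illustrates that standard definition only. The function name and the whitespace tokenization are assumptions made for illustration; the article does not describe how Alibaba Cloud scored its results, and Chinese is often scored per character rather than per whitespace-separated word.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative helper (hypothetical name); tokens are split on whitespace,
    which is an assumption and not the article's stated scoring method.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

Under this definition, a reported rate of 13.91% corresponds to roughly 14 word-level errors per 100 reference words.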

The smaller Qwen3-ASR-0.6B balances accuracy and efficiency, delivering high throughput and low latency under high concurrency. In online asynchronous mode at a concurrency of 128, it can transcribe up to five hours of speech.
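The article does not give a processing time for the five-hour figure. As a rough back-of-the-envelope sketch, the throughput quoted earlier (2,000 seconds of speech per second) can be converted into a wall-clock estimate. This assumes that figure also applies at a concurrency of 128, which the article does not state; the function and constant names are hypothetical.

```python
def wall_clock_seconds(audio_seconds: float, throughput: float) -> float:
    """Estimated processing time, given throughput in audio seconds per wall-clock second."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return audio_seconds / throughput


# Figure quoted in the article: 2,000 seconds of speech transcribed per second.
REPORTED_THROUGHPUT = 2000.0
FIVE_HOURS_IN_SECONDS = 5 * 3600

# Real-time factor: processing time per second of audio (lower is faster).
real_time_factor = 1 / REPORTED_THROUGHPUT

# Assumption: the same throughput holds for the five-hour, concurrency-128 case.
estimated_seconds = wall_clock_seconds(FIVE_HOURS_IN_SECONDS, REPORTED_THROUGHPUT)

if __name__ == "__main__":
    print(f"real-time factor: {real_time_factor}")
    print(f"estimated time for five hours of audio: {estimated_seconds} s")
```

Under that assumption, five hours of audio would take on the order of nine seconds of aggregate processing time; actual latency depends on hardware and batching details the article does not cover.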

Meanwhile, the Qwen3-ForcedAligner-0.6B outperforms leading end-to-end forced alignment models including Nemo-Forced-Aligner, WhisperX, and Monotonic-Aligner, offering superior language coverage, timestamp accuracy, and support for varied speech and audio lengths.

The post Qwen Open-Sources Advanced ASR And Forced Alignment Models With Multi-Language Capabilities appeared first on Metaverse Post.
