The post NVIDIA NCCL 2.28 Revolutionizes GPU Communication with New Device API appeared on BitcoinEthereumNews.com.

NVIDIA NCCL 2.28 Revolutionizes GPU Communication with New Device API

Rebeca Moen
Nov 10, 2025 23:56

NVIDIA’s NCCL 2.28 release introduces a device API that lets CUDA kernels initiate communication themselves, so communication and computation can be fused on multi-GPU systems.

NVIDIA has released NCCL 2.28, the latest version of the NVIDIA Collective Communications Library (NCCL). According to NVIDIA, the update focuses on fusing communication with computation, with the aim of raising throughput, reducing latency, and maximizing GPU utilization across multi-GPU and multi-node systems.

Key Features of NCCL 2.28

NCCL 2.28 adds GPU-initiated networking, device APIs for communication-compute fusion, and copy-engine-based collectives, all intended to help developers build efficient, scalable distributed applications. The release also includes expanded APIs, improved tooling, and cleaner integration paths that make custom communication kernels easier to develop.

Device API and Copy Engine Collectives

The new device API lets developers write custom CUDA kernels that perform communication from inside the kernel, so those transfers no longer have to be launched from the host. Keeping communication on the device reduces synchronization overhead, which in turn raises throughput and lowers latency. The API introduces three operation modes: Load/Store Accessible (LSA), for peers whose memory can be reached with ordinary loads and stores, such as GPUs connected over NVLink; Multimem, which uses NVLink hardware multicast; and GPU-Initiated Networking (GIN), for communication over the network.

In addition, copy-engine-based collectives make NVLink transfers more efficient by offloading communication from the streaming multiprocessors (SMs) to the GPU's dedicated copy engines. Because the SMs are no longer shared between the two kinds of work, resource contention drops and communication can run at the same time as computation.

NCCL Inspector for Enhanced Profiling

The NCCL Inspector, a new profiling plugin, provides always-on observability and analysis of NCCL communication patterns. It logs detailed performance data and metadata, so developers can analyze and debug collective operations. The plugin tracks each NCCL communicator individually, which exposes performance patterns across different communication contexts.

Developer Experience Improvements

NCCL 2.28 also improves the developer experience with new APIs for the AllToAll, Gather, and Scatter operations. It introduces an environment plugin API for flexible configuration management, which supports programmatic version matching and lets configuration be kept in any storage backend. Additionally, the release supports CMake for Linux builds, which simplifies integration into larger build pipelines.
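
For readers unfamiliar with the three operations named above, the following is a minimal host-side sketch of their usual data-movement semantics, written in plain Python. It is not NCCL code and does not call any NCCL API: the function names, the flat-list buffers, and the `count` parameter are assumptions made for illustration, and every "rank" is simply a list in one process.

```python
from typing import List, Sequence


def all_to_all(sendbufs: Sequence[Sequence[int]], count: int) -> List[List[int]]:
    """Simulate AllToAll: rank r sends its p-th chunk of `count` elements
    to rank p, and stores the chunk it receives from rank p in slot p."""
    n = len(sendbufs)
    for buf in sendbufs:
        if len(buf) != n * count:
            raise ValueError("each send buffer must hold n_ranks * count elements")
    return [
        [x for src in range(n) for x in sendbufs[src][dst * count:(dst + 1) * count]]
        for dst in range(n)
    ]


def gather(sendbufs: Sequence[Sequence[int]]) -> List[int]:
    """Simulate Gather: return the root's receive buffer, which holds every
    rank's contribution concatenated in rank order."""
    return [x for buf in sendbufs for x in buf]


def scatter(root_sendbuf: Sequence[int], n_ranks: int) -> List[List[int]]:
    """Simulate Scatter: split the root's buffer into n_ranks equal chunks
    and return the chunk each rank receives, indexed by rank."""
    if n_ranks <= 0 or len(root_sendbuf) % n_ranks != 0:
        raise ValueError("root buffer must divide evenly across ranks")
    count = len(root_sendbuf) // n_ranks
    return [list(root_sendbuf[r * count:(r + 1) * count]) for r in range(n_ranks)]


if __name__ == "__main__":
    # Two simulated ranks, two elements per chunk.
    per_rank = [[0, 1, 2, 3], [10, 11, 12, 13]]
    print("all_to_all:", all_to_all(per_rank, count=2))
    print("gather:    ", gather(per_rank))
    print("scatter:   ", scatter(gather(per_rank), n_ranks=2))
```

In this sketch, AllToAll behaves like a transpose of chunks across ranks, while Scatter undoes Gather when both use the same root. The real library operates on device buffers and CUDA streams, so the sketch is only a guide to what data ends up where.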

For further details on NCCL 2.28 and its features, visit the official NVIDIA blog.

Source: https://blockchain.news/news/nvidia-nccl-2-28-revolutionizes-gpu-communication
