This article explores the implementation of gradient descent algorithms for minimizing global loss functions in neural networks, particularly in problems governed by Rankine-Hugoniot conditions. While gradient descent reliably converges, scalability issues arise when handling large domains with many coupled networks. To address this, a domain decomposition method (DDM) is introduced, enabling parallel optimization of local loss functions. The result is faster convergence, improved scalability, and a more efficient framework for training complex AI models.This article explores the implementation of gradient descent algorithms for minimizing global loss functions in neural networks, particularly in problems governed by Rankine-Hugoniot conditions. While gradient descent reliably converges, scalability issues arise when handling large domains with many coupled networks. To address this, a domain decomposition method (DDM) is introduced, enabling parallel optimization of local loss functions. The result is faster convergence, improved scalability, and a more efficient framework for training complex AI models.

Why Gradient Descent Converges (and Sometimes Doesn’t) in Neural Networks

2025/09/19 18:38

Abstract and 1. Introduction

1.1. Introductory remarks

1.2. Basics of neural networks

1.3. About the entropy of direct PINN methods

1.4. Organization of the paper

  1. Non-diffusive neural network solver for one dimensional scalar HCLs

    2.1. One shock wave

    2.2. Arbitrary number of shock waves

    2.3. Shock wave generation

    2.4. Shock wave interaction

    2.5. Non-diffusive neural network solver for one dimensional systems of CLs

    2.6. Efficient initial wave decomposition

  2. Gradient descent algorithm and efficient implementation

    3.1. Classical gradient descent algorithm for HCLs

    3.2. Gradient descent and domain decomposition methods

  3. Numerics

    4.1. Practical implementations

    4.2. Basic tests and convergence for 1 and 2 shock wave problems

    4.3. Shock wave generation

    4.4. Shock-Shock interaction

    4.5. Entropy solution

    4.6. Domain decomposition

    4.7. Nonlinear systems

  4. Conclusion and References

3. Gradient descent algorithm and efficient implementation

In this section we discuss the implementation of gradient descent algorithms for solving the minimization problems (11), (20) and (35). We note that these problems involve a global loss functional measuring the residue of HCL in the whole domain, as well Rankine-Hugoniot conditions, which results in training of a number of neural networks. In all the tests we have done, the gradient descent method converges and provides accurate results. We note also, that in problems with a large number of DLs, the global loss functional couples a large number of networks and the gradient descent algorithm may converge slowly. For these problems we present a domain decomposition method (DDM).

3.1. Classical gradient descent algorithm for HCLs

All the problems (11), (20) and (35) being similar, we will demonstrate in details the algorithm for the problem (20). We assume that the solution is initially constituted by i) D ∈ {1, 2, . . . , } entropic shock waves emanating from x1, . . . , xD, ii) an arbitrary number of rarefaction waves, and that iii) there is no shock generation for t ∈ [0, T].

\

\

3.2. Gradient descent and domain decomposition methods

Rather than minimizing the global loss function (21) (or (12), (36)), we here propose to decouple the optimization of the neural networks, and make it scalable. The approach is closely connected to domain decomposition methods (DDMs) Schwarz Waveform Relaxation (SWR) methods [21, 22, 23]. The resulting algorithm allows for embarrassingly parallel computation of minimization of local loss functions.

\ \

\ \ \

\ \ \

\ \ In conclusion, the DDM becomes relevant thanks to its scalability and for kDDMkLocal < kGlobal, which is expected for D large.

\

:::info Authors:

(1) Emmanuel LORIN, School of Mathematics and Statistics, Carleton University, Ottawa, Canada, K1S 5B6 and Centre de Recherches Mathematiques, Universit´e de Montr´eal, Montreal, Canada, H3T 1J4 ([email protected]);

(2) Arian NOVRUZI, a Corresponding Author from Department of Mathematics and Statistics, University of Ottawa, Ottawa, ON K1N 6N5, Canada ([email protected]).

:::


:::info This paper is available on arxiv under CC by 4.0 Deed (Attribution 4.0 International) license.

:::

\

Disclaimer: The articles reposted on this site are sourced from public platforms and are provided for informational purposes only. They do not necessarily reflect the views of MEXC. All rights remain with the original authors. If you believe any content infringes on third-party rights, please contact [email protected] for removal. MEXC makes no guarantees regarding the accuracy, completeness, or timeliness of the content and is not responsible for any actions taken based on the information provided. The content does not constitute financial, legal, or other professional advice, nor should it be considered a recommendation or endorsement by MEXC.

You May Also Like

Strive’s $500M Bitcoin ATM Program Could Boost Stock Value Up to 30x in 10 Years

Strive’s $500M Bitcoin ATM Program Could Boost Stock Value Up to 30x in 10 Years

The post Strive’s $500M Bitcoin ATM Program Could Boost Stock Value Up to 30x in 10 Years appeared on BitcoinEthereumNews.com. Strive’s $500M SATA ATM program enables the issuance of preferred stock to fund Bitcoin acquisitions, enhance financial flexibility, and support long-term growth. This strategic move, filed with the SEC on December 9, 2025, positions the company to hold more BTC while potentially boosting stock value through compounding effects over 20 years. Strive’s $500M SATA ATM targets Bitcoin purchases and corporate expansion to build lasting financial strength. Financial projections suggest the stock could multiply 30 times in 10 years due to Bitcoin’s growth and leverage strategies. With 7,525 BTC already held as of November 7, 2025, sustained demand for SATA could elevate stock prices to $1,160 by year 20, per analyst models. Discover how Strive’s $500M SATA ATM program fuels Bitcoin strategy and stock growth. Learn projections, goals, and impacts in this detailed analysis. Stay ahead in crypto finance—explore now! What is Strive’s $500M SATA ATM Program? Strive’s $500M SATA ATM program is an at-the-market offering designed to issue up to $500 million in Variable Rate Series A Perpetual Preferred Stock, known as SATA. This initiative, detailed in a sales agreement filed with the Securities and Exchange Commission on December 9, 2025, provides Strive with flexible capital-raising options without fixed timelines or pricing commitments. The proceeds will primarily support Bitcoin holdings, acquisitions, debt repayment, and other corporate needs, reinforcing the company’s commitment to digital assets. How Does the SATA ATM Structure Support Bitcoin Growth? The SATA ATM allows Strive to sell shares opportunistically through broker-dealers, adapting to market conditions for optimal pricing. This structure minimizes dilution risks while generating funds for strategic investments. As of November 7, 2025, Strive already holds 7,525 BTC, and additional acquisitions via this program could amplify exposure to Bitcoin’s potential appreciation. Financial analyst Adam Livingston highlights the program’s role in “long-term intelligent leverage on Bitcoin,” enabling…
Share
BitcoinEthereumNews2025/12/10 23:15
Cryptos Signal Divergence Ahead of Fed Rate Decision

Cryptos Signal Divergence Ahead of Fed Rate Decision

The post Cryptos Signal Divergence Ahead of Fed Rate Decision appeared on BitcoinEthereumNews.com. Crypto assets send conflicting signals ahead of the Federal Reserve’s September rate decision. On-chain data reveals a clear decrease in Bitcoin and Ethereum flowing into centralized exchanges, but a sharp increase in altcoin inflows. The findings come from a Tuesday report by CryptoQuant, an on-chain data platform. The firm’s data shows a stark divergence in coin volume, which has been observed in movements onto centralized exchanges over the past few weeks. Bitcoin and Ethereum Inflows Drop to Multi-Month Lows Sponsored Sponsored Bitcoin has seen a dramatic drop in exchange inflows, with the 7-day moving average plummeting to 25,000 BTC, its lowest level in over a year. The average deposit per transaction has fallen to 0.57 BTC as of September. This suggests that smaller retail investors, rather than large-scale whales, are responsible for the recent cash-outs. Ethereum is showing a similar trend, with its daily exchange inflows decreasing to a two-month low. CryptoQuant reported that the 7-day moving average for ETH deposits on exchanges is around 783,000 ETH, the lowest in two months. Other Altcoins See Renewed Selling Pressure In contrast, other altcoin deposit activity on exchanges has surged. The number of altcoin deposit transactions on centralized exchanges was quite steady in May and June of this year, maintaining a 7-day moving average of about 20,000 to 30,000. Recently, however, that figure has jumped to 55,000 transactions. Altcoins: Exchange Inflow Transaction Count. Source: CryptoQuant CryptoQuant projects that altcoins, given their increased inflow activity, could face relatively higher selling pressure compared to BTC and ETH. Meanwhile, the balance of stablecoins on exchanges—a key indicator of potential buying pressure—has increased significantly. The report notes that the exchange USDT balance, around $273 million in April, grew to $379 million by August 31, marking a new yearly high. CryptoQuant interprets this surge as a reflection of…
Share
BitcoinEthereumNews2025/09/18 01:01