Improved Particle Approximation Error for Mean Field Neural Networks

Atsushi Nitanda

Abstract: Mean-field Langevin dynamics (MFLD) minimizes an entropy-regularized nonlinear convex functional defined over the space of probability distributions. MFLD has gained attention due to its connection with noisy gradient descent for mean-field two-layer neural networks. Unlike standard Langevin dynamics, the nonlinearity of the objective functional induces particle interactions, so multiple particles are needed to approximate the dynamics in a finite-particle setting. Recent works (Chen et al., 2022; Suzuki et al., 2023b) have demonstrated uniform-in-time propagation of chaos for MFLD, showing that the gap between the particle system and its mean-field limit shrinks uniformly over time as the number of particles increases. In this work, we improve the dependence of their particle approximation errors on the logarithmic Sobolev inequality (LSI) constant, which can deteriorate exponentially in the regularization coefficient. Specifically, we establish an LSI-constant-free particle approximation error for the objective gap by leveraging the problem structure in risk minimization. As applications, we demonstrate improved convergence of MFLD, a sampling guarantee for the mean-field stationary distribution, and uniform-in-time Wasserstein propagation of chaos in terms of particle complexity.
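To make the setup concrete, the following is a sketch of the standard MFLD formulation used in this line of work (Chen et al., 2022; Suzuki et al., 2023b); the notation is generic and may differ from the paper's. For an entropy-regularized objective F_\lambda(\mu) = F(\mu) + \lambda\,\mathrm{Ent}(\mu), the mean-field dynamics and its N-particle approximation read

\begin{align*}
  \mathrm{d}X_t   &= -\nabla \frac{\delta F}{\delta\mu}(\mu_t)(X_t)\,\mathrm{d}t + \sqrt{2\lambda}\,\mathrm{d}W_t,
  & \mu_t &= \mathrm{Law}(X_t), \\
  \mathrm{d}X_t^i &= -\nabla \frac{\delta F}{\delta\mu}(\hat{\mu}_t^N)(X_t^i)\,\mathrm{d}t + \sqrt{2\lambda}\,\mathrm{d}W_t^i,
  & \hat{\mu}_t^N &= \frac{1}{N}\sum_{j=1}^N \delta_{X_t^j},
\end{align*}

where the nonlinearity of F couples the particles through the empirical measure \hat{\mu}_t^N, and propagation of chaos quantifies the gap between \hat{\mu}_t^N and \mu_t as N grows.

In discrete time, the particle system is noisy gradient descent on the neuron weights of a mean-field two-layer network. The Python sketch below, on synthetic data, is a minimal illustration of that correspondence; the architecture, loss, and all hyperparameter names are assumptions made for the example, not taken from the paper.

import numpy as np

# Minimal sketch of noisy gradient descent for a mean-field two-layer
# network, i.e. the finite-particle, discrete-time version of MFLD.
# All names, data, and hyperparameters here are illustrative assumptions.

rng = np.random.default_rng(0)

N, d, n = 256, 5, 128                 # particles (neurons), input dim, samples
lam, eta, steps = 1e-2, 1e-1, 1000    # entropy-regularization coeff., step size, iterations

X = rng.normal(size=(n, d))           # synthetic inputs
y = np.tanh(X @ rng.normal(size=d))   # synthetic targets
W = rng.normal(size=(N, d))           # particles = first-layer weights

def predict(W, X):
    # Mean-field prediction: average of N neurons tanh(<w_i, x>).
    return np.tanh(X @ W.T).mean(axis=1)

for _ in range(steps):
    resid = predict(W, X) - y                 # (n,) residuals
    act = 1.0 - np.tanh(X @ W.T) ** 2         # (n, N) tanh' at each (sample, particle)
    # Per-particle drift: gradient of the first variation of the halved
    # squared risk, (1/n) * sum_j resid_j * tanh'(<w_i, x_j>) * x_j.
    grad = (resid[:, None] * act).T @ X / n   # (N, d)
    # Euler-Maruyama step: gradient descent plus Gaussian noise of
    # variance 2*lam*eta, which implements the entropy regularization.
    W -= eta * grad
    W += np.sqrt(2.0 * lam * eta) * rng.normal(size=W.shape)

print("final mse:", np.mean((predict(W, X) - y) ** 2))

Each particle is one neuron, and the drift is evaluated at the empirical measure of the particles, which is exactly the interaction that propagation-of-chaos results control as N increases.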

Submission history

From: Atsushi Nitanda
[v1] Fri, 24 May 2024 17:59:06 UTC (55 KB)
[v2] Fri, 14 Jun 2024 13:20:06 UTC (55 KB)
[v3] Wed, 30 Oct 2024 14:24:34 UTC (56 KB)


