arXiv:2408.11974v1 Announce Type: new
Abstract: We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) for solving structured nonconvex minimax optimization problems of the form $\min_{\mathbf{x}} \max_{\mathbf{y} \in Y} f(\mathbf{x}, \mathbf{y})$, where the objective function $f(\mathbf{x}, \mathbf{y})$ is nonconvex in $\mathbf{x}$ and concave in $\mathbf{y}$, and the constraint set $Y \subseteq \mathbb{R}^n$ is convex and bounded. In the convex-concave setting, single-timescale GDA achieves strong convergence guarantees and has been used to solve application problems arising in operations research and computer science. However, it can fail to converge in more general settings. Our contribution in this paper is to design simple deterministic and stochastic TTGDA algorithms that efficiently find a stationary point of the function $\Phi(\cdot) := \max_{\mathbf{y} \in Y} f(\cdot, \mathbf{y})$. Specifically, we prove theoretical bounds on the complexity of solving both smooth and nonsmooth nonconvex-concave minimax optimization problems. To our knowledge, this is the first systematic analysis of TTGDA for nonconvex minimax optimization, shedding light on its superior performance in training generative adversarial networks (GANs) and in solving other real-world application problems.
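The core idea of TTGDA is to run projected gradient ascent on $\mathbf{y}$ with a larger step size than the descent step on $\mathbf{x}$, so that $\mathbf{y}$ approximately tracks the inner maximizer of $\Phi$. Below is a minimal sketch of that idea on a hypothetical toy objective $f(x, y) = (x^2 - 1)^2 y - y^2/2$ over $Y = [0, 1]$; the objective, step sizes, and iteration count are illustrative assumptions, not the paper's algorithm parameters or experimental setup.

```python
# Minimal TTGDA sketch on a toy nonconvex-concave problem (assumed example,
# not the paper's setup): f(x, y) = (x^2 - 1)^2 * y - 0.5 * y^2, Y = [0, 1].
import numpy as np

def grad_x(x, y):
    # df/dx: nonconvex in x (quartic)
    return 4.0 * x * (x**2 - 1.0) * y

def grad_y(x, y):
    # df/dy: concave in y
    return (x**2 - 1.0)**2 - y

def project_Y(y, lo=0.0, hi=1.0):
    # Euclidean projection onto the convex, bounded set Y = [lo, hi]
    return np.clip(y, lo, hi)

def ttgda(x0, y0, eta_x=1e-3, eta_y=1e-1, iters=5000):
    # Two timescales: the ascent step eta_y is much larger than the
    # descent step eta_x, so y stays near the inner maximizer.
    x, y = x0, y0
    for _ in range(iters):
        gx, gy = grad_x(x, y), grad_y(x, y)
        x = x - eta_x * gx                # slow descent on x
        y = project_Y(y + eta_y * gy)     # fast projected ascent on y
    return x, y

x_star, y_star = ttgda(x0=0.5, y0=0.5)
print(f"x = {x_star:.4f}, y = {y_star:.4f}")  # converges near x = 1, y = 0
```

With simultaneous updates and a single shared step size, plain GDA can cycle on such problems; the step-size separation is what lets the analysis treat $\mathbf{y}$ as an approximate best response when bounding stationarity of $\Phi$.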