Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization




arXiv:2408.11974v1 Announce Type: new
Abstract: We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) for solving structured nonconvex minimax optimization problems of the form $\min_{\textbf{x}} \max_{\textbf{y} \in Y} f(\textbf{x}, \textbf{y})$, where the objective function $f(\textbf{x}, \textbf{y})$ is nonconvex in $\textbf{x}$ and concave in $\textbf{y}$, and the constraint set $Y \subseteq \mathbb{R}^n$ is convex and bounded. In the convex-concave setting, single-timescale GDA achieves strong convergence guarantees and has been used to solve application problems arising in operations research and computer science. However, it can fail to converge in more general settings. Our contribution in this paper is to design simple deterministic and stochastic TTGDA algorithms that efficiently find a stationary point of the function $\Phi(\cdot) := \max_{\textbf{y} \in Y} f(\cdot, \textbf{y})$. Specifically, we prove theoretical bounds on the complexity of solving both smooth and nonsmooth nonconvex-concave minimax optimization problems. To our knowledge, this is the first systematic analysis of TTGDA for nonconvex minimax optimization, shedding light on its superior performance in training generative adversarial networks (GANs) and in solving other real-world application problems.
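
For intuition, the sketch below illustrates the two-timescale idea described in the abstract: gradient descent on $\textbf{x}$ with a small step size is interleaved with projected gradient ascent on $\textbf{y}$ over a bounded convex set $Y$, using a much larger step size. The toy objective, the box constraint, and the specific step sizes are illustrative assumptions for this sketch, not the setup or algorithm parameters analyzed in the paper.

```python
import numpy as np

# Minimal sketch of two-timescale GDA (TTGDA) on a toy nonconvex-concave problem.
# Toy objective (an assumption for illustration, not the paper's setup):
#   f(x, y) = cos(1^T x) + x^T A y - 0.5 * ||y||^2,  with y constrained to the box [-1, 1]^n
# f is nonconvex in x (through the cosine term) and strongly concave in y.

rng = np.random.default_rng(0)
m, n = 5, 4
A = rng.standard_normal((m, n)) / np.sqrt(m * n)

def grad_x(x, y):
    # d/dx [cos(1^T x) + x^T A y] = -sin(1^T x) * 1 + A y
    return -np.sin(x.sum()) * np.ones_like(x) + A @ y

def grad_y(x, y):
    # d/dy [x^T A y - 0.5 * ||y||^2] = A^T x - y
    return A.T @ x - y

def project_Y(y):
    # Euclidean projection onto the box [-1, 1]^n (a convex, bounded set)
    return np.clip(y, -1.0, 1.0)

x = rng.standard_normal(m)
y = np.zeros(n)

# Two timescales: the ascent step size is much larger than the descent step size.
eta_x, eta_y = 1e-3, 1e-1

for t in range(10_000):
    y = project_Y(y + eta_y * grad_y(x, y))   # projected gradient ascent on y (fast timescale)
    x = x - eta_x * grad_x(x, y)              # gradient descent on x (slow timescale)

print("approximate stationary point of Phi(x) = max_y f(x, y):", x)
```

The separation of step sizes is the point of the sketch: with a single shared step size, plain GDA can cycle or diverge on nonconvex-concave problems, whereas letting the ascent variable track its (near-)best response stabilizes the descent on $\Phi$.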


