arXiv:2410.03706v1 Announce Type: new
Abstract: The goal of this work is to serve as a foundation for deep studies of the topology of state, action, and policy spaces in reinforcement learning. By studying these spaces from a mathematical perspective, we expect to gain more insight into how to build better algorithms to solve decision problems. Therefore, we focus on presenting the connection between the Banach fixed point theorem and the convergence of reinforcement learning algorithms, and we illustrate how the insights gained from this can practically help in designing more efficient algorithms. Before doing so, however, we first introduce relevant concepts such as metric spaces, normed spaces and Banach spaces for better understanding, before expressing the entire reinforcement learning problem in terms of Markov decision processes. This allows us to properly introduce the Banach contraction principle in a language suitable for reinforcement learning, and to write the Bellman equations in terms of operators on Banach spaces to show why reinforcement learning algorithms converge. Finally, we show how the insights gained from the mathematical study of convergence are helpful in reasoning about the best ways to make reinforcement learning algorithms more efficient.
Source link
lol