View a PDF of the paper titled GenSafe: A Generalizable Safety Enhancer for Safe Reinforcement Learning Algorithms Based on Reduced Order Markov Decision Process Model, by Zhehua Zhou and 4 other authors
Abstract:Safe Reinforcement Learning (SRL) aims to realize a safe learning process for Deep Reinforcement Learning (DRL) algorithms by incorporating safety constraints. However, the efficacy of SRL approaches often relies on accurate function approximations, which are notably challenging to achieve in the early learning stages due to data insufficiency. To address this issue, we introduce in this work a novel Generalizable Safety enhancer (GenSafe) that is able to overcome the challenge of data insufficiency and enhance the performance of SRL approaches. Leveraging model order reduction techniques, we first propose an innovative method to construct a Reduced Order Markov Decision Process (ROMDP) as a low-dimensional approximator of the original safety constraints. Then, by solving the reformulated ROMDP-based constraints, GenSafe refines the actions of the agent to increase the possibility of constraint satisfaction. Essentially, GenSafe acts as an additional safety layer for SRL algorithms. We evaluate GenSafe on multiple SRL approaches and benchmark problems. The results demonstrate its capability to improve safety performance, especially in the early learning phases, while maintaining satisfactory task performance. Our proposed GenSafe not only offers a novel measure to augment existing SRL methods but also shows broad compatibility with various SRL algorithms, making it applicable to a wide range of systems and SRL problems.
Submission history
From: Zhehua Zhou [view email]
[v1]
Thu, 6 Jun 2024 09:51:30 UTC (5,261 KB)
[v2]
Tue, 14 Jan 2025 10:32:32 UTC (14,390 KB)
Source link
lol