Compositional Automata Embeddings for Goal-Conditioned Reinforcement Learning

stp2y, January 16, 2025

[Submitted on 31 Oct 2024 (v1), last revised 15 Jan 2025 (this version, v2)]

Authors: Beyazit Yalcinkaya and 3 other authors

Abstract: Goal-conditioned reinforcement learning is a powerful way to control an AI agent’s behavior at runtime. That said, popular goal representations, e.g., target states or natural language, are either limited to Markovian tasks or rely on ambiguous task semantics. We propose representing temporal goals using compositions of deterministic finite automata (cDFAs) and use cDFAs to guide RL agents. cDFAs balance the need for formal temporal semantics with ease of interpretation: if one can understand a flow chart, one can understand a cDFA. On the other hand, cDFAs form a countably infinite concept class with Boolean semantics, and subtle changes to the automaton can result in very different tasks, making them difficult to condition agent behavior on. To address this, we observe that all paths through a DFA correspond to a series of reach-avoid tasks and propose pre-training graph neural network embeddings on “reach-avoid derived” DFAs. Through empirical evaluation, we demonstrate that the proposed pre-training method enables zero-shot generalization to various cDFA task classes and accelerated policy specialization without the myopic suboptimality of hierarchical methods.
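To make the abstract's two central ideas concrete, here is a minimal Python sketch. It is not the authors' implementation. The `DFA` class, the helpers `cdfa_accepts` and `reach_avoid_sequence`, the symbol names, and the explicit `fail_state` argument are all hypothetical. The sketch also assumes that a cDFA is satisfied when every component DFA accepts (a conjunction), which the abstract does not spell out.

```python
from dataclasses import dataclass


@dataclass
class DFA:
    start: int
    accepting: set
    # (state, symbol) -> next state; any missing pair is treated as a self-loop.
    transitions: dict

    def step(self, state, symbol):
        return self.transitions.get((state, symbol), state)

    def accepts(self, word):
        state = self.start
        for symbol in word:
            state = self.step(state, symbol)
        return state in self.accepting


def cdfa_accepts(dfas, word):
    # Assumed composition semantics: the word must satisfy every DFA.
    return all(dfa.accepts(word) for dfa in dfas)


def reach_avoid_sequence(dfa, path, fail_state):
    """Read a path of states as a list of (reach, avoid) symbol sets, one per hop."""
    tasks = []
    for src, dst in zip(path, path[1:]):
        reach = {s for (q, s), nxt in dfa.transitions.items() if q == src and nxt == dst}
        avoid = {s for (q, s), nxt in dfa.transitions.items() if q == src and nxt == fail_state}
        tasks.append((reach, avoid))
    return tasks


# "Reach red, then green, never touching lava before finishing" (state 3 is a failing sink).
red_then_green = DFA(
    start=0,
    accepting={2},
    transitions={(0, "red"): 1, (0, "lava"): 3, (1, "green"): 2, (1, "lava"): 3},
)
# "Eventually reach blue."
eventually_blue = DFA(start=0, accepting={1}, transitions={(0, "blue"): 1})

cdfa = [red_then_green, eventually_blue]
ok = cdfa_accepts(cdfa, ["red", "blue", "green"])
missing_blue = cdfa_accepts(cdfa, ["red", "green"])
hit_lava = cdfa_accepts(cdfa, ["lava", "red", "green", "blue"])
tasks = reach_avoid_sequence(red_then_green, [0, 1, 2], fail_state=3)
```

In this toy example, the path 0 → 1 → 2 through the first DFA reads as two reach-avoid tasks in a row: reach `red` while avoiding `lava`, then reach `green` while avoiding `lava`. The example covers only the task representation; the graph neural network embedding and the policy are out of scope here.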

Submission history

From: Beyazit Yalcinkaya
[v1] Thu, 31 Oct 2024 20:56:07 UTC (42,648 KB)
[v2] Wed, 15 Jan 2025 01:46:25 UTC (42,648 KB)
