SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning

stp2y, September 30, 2024

[Submitted on 24 Jan 2024 (v1), last revised 27 Sep 2024 (this version, v4)]

Authors: Guoxin Chen, Kexin Tang, Chao Yang, Fuying Ye, Yu Qiao, Yiming Qian

Abstract: Elucidating the reasoning process with structured explanations from question to answer is crucial, as it significantly enhances the interpretability, traceability, and trustworthiness of question-answering (QA) systems. However, structured explanations require models to perform intricately structured reasoning, which poses great challenges. Most existing methods focus on single-step reasoning through supervised learning, ignoring logical dependencies between steps. Moreover, existing reinforcement learning (RL) based methods overlook the structured relationships, underutilizing the potential of RL in structured reasoning. In this paper, we propose SEER, a novel method that maximizes a structure-based return to facilitate structured reasoning and explanation. Our proposed structure-based return precisely describes the hierarchical and branching structure inherent in structured reasoning, effectively capturing the intricate relationships between different reasoning steps. In addition, we introduce a fine-grained reward function to meticulously delineate diverse reasoning steps. Extensive experiments show that SEER significantly outperforms state-of-the-art methods, achieving an absolute improvement of 6.9% over RL-based methods on EntailmentBank and a 4.4% average improvement on the STREET benchmark, while exhibiting outstanding efficiency and cross-dataset generalization performance. Our code is available at this https URL.
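To make the idea of a structure-based return more concrete, here is a minimal Python sketch. It is not taken from the paper or its code. It assumes one plausible reading of the abstract: each reasoning step propagates return from the step that consumes its conclusion (its parent in the reasoning tree) rather than from whichever step happened to come next in time. The names `Step`, `chained_returns`, and `structure_returns`, along with the reward values and the discount `gamma`, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Step:
    step_id: str
    reward: float                 # hypothetical per-step reward
    parent: Optional[str] = None  # step that uses this step's conclusion (assumption)


def chained_returns(steps: List[Step], gamma: float) -> Dict[str, float]:
    """Standard discounted return over the temporal order: G_t = r_t + gamma * G_{t+1}."""
    returns: Dict[str, float] = {}
    running = 0.0
    for step in reversed(steps):
        running = step.reward + gamma * running
        returns[step.step_id] = running
    return returns


def structure_returns(steps: List[Step], gamma: float) -> Dict[str, float]:
    """Tree-shaped return (illustrative): G_t = r_t + gamma * G_parent(t)."""
    by_id = {s.step_id: s for s in steps}
    cache: Dict[str, float] = {}

    def ret(step_id: str) -> float:
        if step_id not in cache:
            s = by_id[step_id]
            downstream = ret(s.parent) if s.parent is not None else 0.0
            cache[step_id] = s.reward + gamma * downstream
        return cache[step_id]

    return {s.step_id: ret(s.step_id) for s in steps}


# Two independent branches (int1, int2) that both feed the final step "hyp".
steps = [
    Step("int1", reward=1.0, parent="hyp"),
    Step("int2", reward=-1.0, parent="hyp"),
    Step("hyp", reward=1.0),
]
chained = chained_returns(steps, gamma=0.5)
structured = structure_returns(steps, gamma=0.5)
print(chained)
print(structured)
```

In this toy example the chained return for `int1` is pulled down by the poor reward of `int2`, even though the two steps sit on separate branches. The tree-shaped variant credits `int1` only through the step that actually depends on it.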

Submission history

From: Guoxin Chen
[v1] Wed, 24 Jan 2024 06:10:51 UTC (7,880 KB)
[v2] Fri, 16 Feb 2024 14:16:06 UTC (7,856 KB)
[v3] Tue, 4 Jun 2024 06:14:56 UTC (7,862 KB)
[v4] Fri, 27 Sep 2024 08:26:01 UTC (1,010 KB)
