Does Vision Accelerate Hierarchical Generalization in Neural Language Learners?

By Tatsuki Kuribayashi and one other author


Abstract: Neural language models (LMs) are arguably less data-efficient than humans from a language acquisition perspective. One fundamental question is why this human-LM gap arises. This study explores the advantage of grounded language acquisition, specifically the impact of visual information (which humans can usually rely on during language acquisition but LMs largely cannot) on syntactic generalization in LMs. Our experiments, following the poverty-of-stimulus paradigm under two scenarios (artificial vs. naturalistic images), demonstrate that when the alignment between the linguistic and visual components of the input is clear, access to vision data does help with the syntactic generalization of LMs; when it is not, visual input does not help. This highlights the need for additional biases or signals, such as mutual gaze, to enhance cross-modal alignment and enable efficient syntactic generalization in multimodal LMs.
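The poverty-of-stimulus paradigm the abstract refers to is the classic question-formation test: the training data is ambiguous between a linear rule (front the first auxiliary in the sentence) and a hierarchical rule (front the main-clause auxiliary), and held-out sentences with relative clauses disambiguate which rule the learner actually acquired. The sketch below illustrates the two candidate rules on toy data; the bracket notation for relative clauses and the `model_score` stub are illustrative assumptions for this sketch, not the paper's implementation.

```python
# Minimal sketch of a poverty-of-stimulus test for question formation.
# Assumptions (not from the paper): relative clauses are delimited with
# bracket tokens for simplicity, and `model_score` is a stub standing in
# for a trained (multimodal) LM's log-probability of a candidate question.

AUX = {"is", "are", "can", "will", "does"}

def linear_rule(tokens):
    """Linear generalization: front the FIRST auxiliary in the string."""
    tokens = list(tokens)
    i = next(k for k, w in enumerate(tokens) if w in AUX)
    return [tokens.pop(i)] + tokens

def hierarchical_rule(tokens):
    """Hierarchical generalization: front the MAIN-CLAUSE auxiliary,
    i.e. the first auxiliary outside any bracketed relative clause."""
    tokens = list(tokens)
    depth = 0
    for k, w in enumerate(tokens):
        if w == "[":
            depth += 1
        elif w == "]":
            depth -= 1
        elif depth == 0 and w in AUX:
            return [tokens.pop(k)] + tokens
    raise ValueError("no main-clause auxiliary found")

def strip_brackets(tokens):
    return [w for w in tokens if w not in "[]"]

# Ambiguous training-style item: both rules produce the same question,
# so such data cannot tell the learner which rule is correct.
simple = "the cat is hungry".split()
assert linear_rule(simple) == hierarchical_rule(simple)

# Disambiguating test item: a relative clause intervenes before the
# main-clause auxiliary, so the two rules now diverge.
complex_ = "the cat [ that is black ] is hungry".split()
linear_q = strip_brackets(linear_rule(complex_))      # is the cat that black is hungry
hier_q = strip_brackets(hierarchical_rule(complex_))  # is the cat that is black hungry
assert linear_q != hier_q

def model_score(question_tokens):
    """Stub: replace with a trained LM's log-probability of the question."""
    raise NotImplementedError

# Diagnostic: a learner that generalized hierarchically should assign a
# higher score to hier_q than to linear_q on disambiguating items.
```

On such minimal pairs, the study's question reduces to whether adding visual input during training shifts the learner's preference toward the hierarchical candidate faster than text-only training does, and the abstract's finding is that this happens only when the visual and linguistic components of the input are clearly aligned.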

Submission history

From: Tatsuki Kuribayashi
[v1] Wed, 1 Feb 2023 18:53:42 UTC (1,564 KB)
[v2] Tue, 1 Oct 2024 16:29:14 UTC (1,600 KB)
[v3] Tue, 17 Dec 2024 08:57:43 UTC (1,577 KB)


