Measuring Sound Symbolism in Audio-visual Models

AmazUtah_NLP at SemEval-2024 Task 9: A MultiChoice Question Answering System for Commonsense Defying Reasoning



arXiv:2409.12306v1 Announce Type: new
Abstract: Audio-visual pre-trained models have gained substantial attention recently and demonstrated superior performance on various audio-visual tasks. This study investigates whether pre-trained audio-visual models demonstrate non-arbitrary associations between sounds and visual representations$unicode{x2013}$known as sound symbolism$unicode{x2013}$which is also observed in humans. We developed a specialized dataset with synthesized images and audio samples and assessed these models using a non-parametric approach in a zero-shot setting. Our findings reveal a significant correlation between the models’ outputs and established patterns of sound symbolism, particularly in models trained on speech data. These results suggest that such models can capture sound-meaning connections akin to human language processing, providing insights into both cognitive architectures and machine learning strategies.



Source link
lol

By stp2y

Leave a Reply

Your email address will not be published. Required fields are marked *

No widgets found. Go to Widget page and add the widget in Offcanvas Sidebar Widget Area.