Experiences from Creating a Benchmark for Sentiment Classification for Varieties of English



Authors: Dipankar Srirag and 3 other authors


Abstract: Existing benchmarks often fail to account for linguistic diversity, like language variants of English. In this paper, we share our experiences from our ongoing project of building a sentiment classification benchmark for three variants of English: Australian (en-AU), Indian (en-IN), and British (en-UK) English. Using Google Places reviews, we explore the effects of various sampling techniques based on label semantics, review length, and sentiment proportion and report performances on three fine-tuned BERT-based models. Our initial evaluation reveals significant performance variations influenced by sample characteristics, label semantics, and language variety, highlighting the need for nuanced benchmark design. We offer actionable insights for researchers to create robust benchmarks, emphasising the importance of diverse sampling, careful label definition, and comprehensive evaluation across linguistic varieties.

Submission history

From: Dipankar Srirag
[v1] Tue, 15 Oct 2024 03:02:03 UTC (794 KB)
[v2] Wed, 13 Nov 2024 04:16:21 UTC (999 KB)


