Rethinking Distribution Shifts: Empirical Analysis and Inductive Modeling for Tabular Data


By Jiashuo Liu and 3 other authors


Abstract: Different distribution shifts require different interventions, and algorithms must be grounded in the specific shifts they address. However, methodological development for robust algorithms typically relies on structural assumptions that lack empirical validation. Advocating for an empirically grounded, data-driven approach to research, we build an empirical testbed comprising natural shifts across 5 tabular datasets and 60,000 method configurations encompassing imbalanced learning and distributionally robust optimization (DRO) methods. We find $Y|X$-shifts are most prevalent on our testbed, in stark contrast to the heavy focus on $X$ (covariate)-shifts in the ML literature. The performance of robust algorithms varies significantly over shift types, and is no better than that of vanilla methods. To understand why, we conduct an in-depth empirical analysis of DRO methods and find that, although often neglected by researchers, implementation details (such as the choice of underlying model class, e.g., XGBoost, and hyperparameter selection) have a bigger impact on performance than the ambiguity set or its radius. To further bridge the gap between methodological research and practice, we design case studies that illustrate how such a data-driven, inductive understanding of distribution shifts can enhance both data-centric and algorithmic interventions.
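To make the distinction between $X$-shifts and $Y|X$-shifts concrete, here is a minimal synthetic sketch. It is not the paper's testbed or method: the data generator, the linear model, and all names (`sample`, `mse`, `coef_src`) are assumptions chosen purely for illustration. A well-specified linear model is fit on source data, then scored on one target where only the feature distribution moves and on another where only the relationship between features and label changes.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample(n, x_mean, coef):
    """Draw X ~ N(x_mean, 1) in 2-D and y = X @ coef + small Gaussian noise."""
    X = rng.normal(loc=x_mean, scale=1.0, size=(n, 2))
    y = X @ coef + rng.normal(scale=0.1, size=n)
    return X, y

def mse(X, y, w):
    """Mean squared error of the linear predictor X @ w."""
    return float(np.mean((X @ w - y) ** 2))

coef_src = np.array([1.0, -2.0])  # hypothetical source relationship

# Source data used for training.
X_src, y_src = sample(5000, x_mean=0.0, coef=coef_src)
# X-shift (covariate shift): features move, P(Y|X) is unchanged.
X_cov, y_cov = sample(5000, x_mean=2.0, coef=coef_src)
# Y|X-shift: same feature distribution, different conditional relationship.
X_con, y_con = sample(5000, x_mean=0.0, coef=np.array([1.0, 1.0]))

# Ordinary least squares fit on the source data only.
w, *_ = np.linalg.lstsq(X_src, y_src, rcond=None)

mse_src = mse(X_src, y_src, w)
mse_x_shift = mse(X_cov, y_cov, w)
mse_yx_shift = mse(X_con, y_con, w)

print(f"source MSE:    {mse_src:.3f}")
print(f"X-shift MSE:   {mse_x_shift:.3f}")
print(f"Y|X-shift MSE: {mse_yx_shift:.3f}")
```

In this toy setup the model is correctly specified, so the $X$-shift barely changes the error, while the $Y|X$-shift degrades it sharply. Real tabular data is rarely this clean, and a misspecified model can also suffer under a pure $X$-shift.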

Submission history

From: Tianyu Wang
[v1] Tue, 11 Jul 2023 14:25:10 UTC (8,143 KB)
[v2] Sun, 23 Jun 2024 03:30:50 UTC (5,958 KB)
[v3] Fri, 12 Jul 2024 12:54:37 UTC (12,311 KB)
[v4] Wed, 13 Nov 2024 15:53:37 UTC (12,663 KB)




By stp2y
