Domain Generalization Using Large Pretrained Models with Mixture-of-Adapters



By Gyuseong Lee and 4 other authors


Abstract: Learning robust vision models that perform well in out-of-distribution (OOD) situations is an important task for model deployment in real-world settings. Despite extensive research in this field, many proposed methods have shown only minor performance improvements over the simplest empirical risk minimization (ERM) approach, which was evaluated on a benchmark with a limited hyperparameter search space. Our focus in this study is on leveraging the knowledge of large pretrained models to better handle OOD scenarios and tackle domain generalization problems. However, prior research has revealed that naively fine-tuning a large pretrained model can impair OOD robustness. Thus, we employ parameter-efficient fine-tuning (PEFT) techniques to effectively preserve OOD robustness while working with large models. Our extensive experiments and analysis confirm that the most effective approaches involve ensembling diverse models and increasing the scale of pretraining. As a result, we achieve state-of-the-art performance in domain generalization tasks. Our code and project page are available at: this https URL
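To make the idea of combining PEFT with an ensemble of adapters more concrete, below is a minimal sketch of a "mixture-of-adapters" layer: a frozen pretrained projection augmented with several LoRA-style low-rank adapters whose contributions are mixed by a learned router. This is an illustrative assumption of how such a layer could look, not the authors' actual implementation; the class name `MixtureOfAdapters` and the parameters `num_experts` and `rank` are hypothetical.

```python
# Minimal mixture-of-adapters sketch (illustrative, not the paper's code).
import torch
import torch.nn as nn


class MixtureOfAdapters(nn.Module):
    def __init__(self, in_features: int, out_features: int,
                 num_experts: int = 4, rank: int = 8):
        super().__init__()
        # Frozen pretrained weight: only the adapters and router are trained,
        # which is the parameter-efficient fine-tuning (PEFT) part.
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)
        self.base.bias.requires_grad_(False)

        # Low-rank adapter "experts" (LoRA-style down/up factors per expert).
        self.down = nn.ParameterList(
            [nn.Parameter(torch.randn(in_features, rank) * 0.01)
             for _ in range(num_experts)]
        )
        self.up = nn.ParameterList(
            [nn.Parameter(torch.zeros(rank, out_features))
             for _ in range(num_experts)]
        )
        # Router produces per-token mixing weights over the adapter experts.
        self.router = nn.Linear(in_features, num_experts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        gates = torch.softmax(self.router(x), dim=-1)            # (..., E)
        expert_out = torch.stack(
            [(x @ a) @ b for a, b in zip(self.down, self.up)],   # each (..., out)
            dim=-1,                                              # (..., out, E)
        )
        delta = (expert_out * gates.unsqueeze(-2)).sum(dim=-1)   # (..., out)
        return self.base(x) + delta


# Usage: swap such a layer in for a projection inside a pretrained backbone,
# then fine-tune only the adapter and router parameters.
layer = MixtureOfAdapters(in_features=768, out_features=768)
h = torch.randn(2, 16, 768)
print(layer(h).shape)  # torch.Size([2, 16, 768])
```

Because the pretrained weights stay frozen and only the small adapters and router are updated, this kind of layer keeps the trainable parameter count low, which is the mechanism the abstract points to for preserving OOD robustness while still adapting a large model.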

Submission history

From: Gyuseong Lee
[v1] Tue, 17 Oct 2023 07:01:24 UTC (9,044 KB)
[v2] Sat, 7 Dec 2024 06:57:05 UTC (11,385 KB)


