Revolutionizing Large Language Model Training through Dynamic Parameter Adjustment

AmazUtah_NLP at SemEval-2024 Task 9: A MultiChoice Question Answering System for Commonsense Defying Reasoning



arXiv:2406.06564v1 Announce Type: new
Abstract: In the era of large language models, the demand for efficient use of computational resources has become critically important. Although parameter-efficient fine-tuning techniques have achieved results comparable to full fine-tuning, their application during the pre-training phase poses significant challenges. Specifically, employing parameter-efficient strategies at the onset of pre-training can severely compromise efficiency, especially in larger models. In this paper, building upon the fine-tuning method LoRA, we introduce a novel parameter-efficient training technique that frequently alters trainable part of parameters, facilitating effective pre-training. Our method not only achieves memory reductions and computational overhead comparable to current state-of-the-art parameter-efficient algorithms during the pre-training phase but also maintains accuracy levels comparable to those of full pre-training. We provide both theoretical analyses and empirical evidence to demonstrate the effectiveness of our approach.



Source link
lol

By stp2y

Leave a Reply

Your email address will not be published. Required fields are marked *

No widgets found. Go to Widget page and add the widget in Offcanvas Sidebar Widget Area.