Optimizing Language Models for Grammatical Acceptability: A Comparative Study of Fine-Tuning Techniques

arXiv:2501.07853v1
Abstract: This study explores fine-tuning (FT) of the Open Pre-trained Transformer (OPT-125M) for grammatical acceptability tasks using the CoLA dataset. By comparing Vanilla-Fine-Tuning (VFT), Pattern-Based-Fine-Tuning (PBFT), and Parameter-Efficient Fine-Tuning (PEFT) techniques such as Low-Rank Adaptation (LoRA), we demonstrate significant improvements in computational efficiency while maintaining high accuracy. Our experiments reveal that while VFT achieves the highest accuracy (81.2%), LoRA enhances FT by reducing memory usage and iteration time by more than 50%, and it increases accuracy in the PBFT case. Context Distillation (CD), though computationally efficient, underperformed, with accuracy around 31%. Our findings contribute to democratizing access to large language models (LLMs) by reducing computational barriers.
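
To illustrate why LoRA can cut training cost, here is a minimal sketch in plain Python. It is not the authors' code. It assumes a single square projection of width 768 (the hidden size of OPT-125M) and a LoRA rank of 8 and scaling factor alpha of 16; the rank, alpha, and helper names are hypothetical choices for illustration, since the abstract does not state the configuration used. The sketch only counts trainable parameters and shows the LoRA forward pass; it does not reproduce the memory or timing figures reported above.

```python
import random


def full_trainable_params(d_in, d_out):
    """Parameters updated when the whole weight matrix W (d_out x d_in) is trained."""
    return d_in * d_out


def lora_trainable_params(d_in, d_out, rank):
    """Parameters updated when only A (rank x d_in) and B (d_out x rank) are trained."""
    return rank * (d_in + d_out)


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, rank):
    """Compute W x + (alpha / rank) * B (A x); W stays frozen, only A and B would be trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    scale = alpha / rank
    return [b + scale * u for b, u in zip(base, update)]


# Parameter count for one 768 x 768 projection (768 is OPT-125M's hidden size).
d_model = 768
rank = 8  # assumed rank, not taken from the paper
full_count = full_trainable_params(d_model, d_model)
lora_count = lora_trainable_params(d_model, d_model, rank)
print(full_count, lora_count, lora_count / full_count)

# Tiny forward-pass demo: B starts at zero, so the adapted layer
# initially behaves exactly like the frozen base layer.
random.seed(0)
W_small = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
A_small = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
B_small = [[0.0, 0.0] for _ in range(4)]
x_small = [random.uniform(-1, 1) for _ in range(4)]
y_base = matvec(W_small, x_small)
y_lora = lora_forward(W_small, A_small, B_small, x_small, alpha=16, rank=2)
```

Under these assumptions the adapter trains 12,288 parameters for the projection instead of 589,824, which is one reason optimizer state and gradient memory shrink; actual savings depend on which layers are adapted and on the training setup.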

By stp2y
