Emergence of a High-Dimensional Abstraction Phase in Language Transformers

Authors: Emily Cheng, Diego Doimo, Corentin Kervadec, Iuri Macocco, Jade Yu, Alessandro Laio, Marco Baroni

Abstract: A language model (LM) is a mapping from a linguistic context to an output token. However, much remains unknown about this mapping, including how its geometric properties relate to its function. We take a high-level geometric approach to its analysis, observing, across five pre-trained transformer-based LMs and three input datasets, a distinct phase characterized by high intrinsic dimensionality. During this phase, representations (1) correspond to the first full linguistic abstraction of the input; (2) are the first to viably transfer to downstream tasks; (3) predict each other across different LMs. Moreover, we find that an earlier onset of the phase strongly predicts better language modelling performance. In short, our results suggest that a central high-dimensionality phase underlies core linguistic processing in many common LM architectures.
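The phase described above is identified by estimating the intrinsic dimension (ID) of each layer's token representations. As a rough illustration of how such an estimate is computed, the sketch below implements the TwoNN estimator (Facco et al., 2017), a standard ID estimator from the same line of work; the paper's exact estimator, sampling, and preprocessing may differ, and `hidden_states` is a hypothetical array of per-layer representations, not part of the paper's released code.

```python
# Minimal sketch of the TwoNN intrinsic-dimension estimator
# (Facco et al., 2017). Illustrative only: the paper's exact
# estimator and preprocessing may differ from this.
import numpy as np
from sklearn.neighbors import NearestNeighbors

def twonn_id(X: np.ndarray, discard_fraction: float = 0.1) -> float:
    """Estimate the intrinsic dimension of points X (n_samples, n_features)."""
    # Distances to the two nearest neighbors (column 0 is the point itself).
    nn = NearestNeighbors(n_neighbors=3).fit(X)
    dists, _ = nn.kneighbors(X)
    r1, r2 = dists[:, 1], dists[:, 2]
    # Guard against duplicate points (zero first-neighbor distance).
    mask = r1 > 0
    mu = r2[mask] / r1[mask]
    # Discard the largest ratios, which are noise-dominated.
    mu = np.sort(mu)[: int(len(mu) * (1.0 - discard_fraction))]
    # Maximum-likelihood fit of the Pareto law mu ~ d * mu^(-d-1):
    # d = n / sum(log mu_i).
    return len(mu) / np.sum(np.log(mu))

# Hypothetical usage: hidden_states[layer] holds one layer's token
# representations; plotting the ID across layers would then reveal
# where a high-dimensional phase emerges.
# ids = [twonn_id(hidden_states[layer]) for layer in range(n_layers)]
```

TwoNN uses only the ratio of second- to first-neighbor distances, so it stays reliable even when the data lie on a curved, low-dimensional manifold embedded in the model's full representation space.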

Submission history

From: Emily Cheng
[v1] Fri, 24 May 2024 11:49:07 UTC (8,818 KB)
[v2] Fri, 20 Dec 2024 23:09:46 UTC (9,946 KB)


