MLPs Learn In-Context on Regression and Classification Tasks
William L. Tong and Cengiz Pehlevan
Abstract: In-context learning (ICL), the remarkable ability to solve a task from only input exemplars, is often assumed to be a unique hallmark of Transformer models. By examining commonly employed synthetic ICL tasks, we demonstrate that multi-layer perceptrons (MLPs) can also learn in-context. Moreover, MLPs, and the closely related MLP-Mixer models, learn in-context competitively with Transformers given the same compute budget in this setting. We further show that MLPs outperform Transformers on a series of classical tasks from psychology designed to test relational reasoning, which are closely related to in-context classification. These results underscore a need for studying in-context learning beyond attention-based architectures, while also challenging strong prior arguments about MLPs' limited ability to solve relational tasks. Altogether, our results highlight the unexpected competence of MLPs, and support the growing interest in all-MLP alternatives to task-specific architectures.
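To make the setup concrete, below is a minimal sketch of a commonly used synthetic in-context regression task of the kind the abstract refers to, and of how the exemplar context can be flattened into a single vector so that an MLP (rather than a sequence model) can consume it. The function name, parameter choices, and flattening scheme are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def make_icl_regression_batch(n_tasks=64, n_exemplars=8, dim=4, noise=0.1, rng=None):
    """Hypothetical synthetic in-context linear-regression tasks.

    Each task draws its own weight vector w, presents n_exemplars (x, y)
    pairs plus a query x, and asks the model to predict the query's y.
    For an MLP, the whole context is flattened into one input vector.
    """
    rng = rng or np.random.default_rng(0)
    w = rng.normal(size=(n_tasks, dim))                    # per-task regression weights
    xs = rng.normal(size=(n_tasks, n_exemplars + 1, dim))  # exemplar inputs + query input
    ys = np.einsum('td,tnd->tn', w, xs) + noise * rng.normal(size=(n_tasks, n_exemplars + 1))

    # Context: exemplar (x, y) pairs plus the query x, flattened to one vector per task.
    context = np.concatenate(
        [xs[:, :-1].reshape(n_tasks, -1),   # exemplar inputs
         ys[:, :-1],                        # exemplar targets
         xs[:, -1]],                        # query input (its target is withheld)
        axis=1)
    target = ys[:, -1]                      # value the model must infer in-context
    return context, target

contexts, targets = make_icl_regression_batch()
print(contexts.shape, targets.shape)  # (64, 44) (64,) with the defaults above
```

Solving this task requires inferring each task's latent weights from the exemplars at inference time; no fixed input-output mapping works across tasks, which is what makes it a test of in-context learning rather than ordinary supervised fitting.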
Submission history
From: William Tong
[v1] Fri, 24 May 2024 15:04:36 UTC (5,540 KB)
[v2] Thu, 26 Sep 2024 16:05:30 UTC (5,725 KB)