Symbolic Regression with a Learned Concept Library
Arya Grayeli and 4 other authors
Abstract: We present a novel method for symbolic regression (SR), the task of searching for compact programmatic hypotheses that best explain a dataset. The problem is commonly solved using genetic algorithms; we show that such methods can be enhanced by inducing a library of abstract textual concepts. Our algorithm, called LaSR, uses zero-shot queries to a large language model (LLM) to discover and evolve concepts occurring in known high-performing hypotheses. New hypotheses are then discovered using a mix of standard evolutionary steps and LLM-guided steps, conditioned on the discovered concepts. Once found, these hypotheses seed a new round of concept abstraction and evolution. We validate LaSR on the Feynman equations, a popular SR benchmark, as well as a set of synthetic tasks. On these benchmarks, LaSR substantially outperforms a variety of state-of-the-art SR approaches based on deep learning and evolutionary algorithms. Moreover, we show that LaSR can be used to discover a novel and powerful scaling law for LLMs.
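The alternating loop the abstract describes (evolve hypotheses, abstract concepts from the best ones, condition future mutations on those concepts) can be sketched as follows. This is purely illustrative: every function name is an assumption, and the zero-shot LLM queries are replaced by trivial stubs so the sketch is self-contained and runnable.

```python
import random

random.seed(0)  # deterministic for the sake of the example


def fitness(expr, data):
    """Mean squared error of a candidate expression on (x, y) pairs (lower is better)."""
    try:
        return sum((eval(expr, {"x": x}) - y) ** 2 for x, y in data) / len(data)
    except Exception:
        return float("inf")  # malformed expressions are heavily penalized


def standard_mutation(expr):
    """A toy stand-in for a genetic-algorithm edit: perturb the expression."""
    return f"({expr}) + {random.choice([-1, 1])}"


def llm_guided_mutation(expr, concepts):
    """Stub for an LLM-guided edit conditioned on the concept library.

    A real implementation would send a zero-shot prompt containing
    `concepts` to an LLM; here we just apply a concept-flavored edit.
    """
    if "multiplicative structure" in concepts:
        return f"({expr}) * x"
    return standard_mutation(expr)


def abstract_concepts(top_exprs):
    """Stub for zero-shot concept abstraction over high-performing hypotheses."""
    if any("*" in e for e in top_exprs):
        return {"multiplicative structure"}
    return set()


def lasr_sketch(data, generations=10, pop_size=8):
    population = ["x", "x + 1", "2 * x"]
    concepts = set()
    for _ in range(generations):
        # Mix standard evolutionary steps with concept-conditioned LLM steps.
        children = [
            llm_guided_mutation(e, concepts)
            if random.random() < 0.5
            else standard_mutation(e)
            for e in population
        ]
        # Keep the fittest hypotheses (elitism over parents + children).
        population = sorted(population + children,
                            key=lambda e: fitness(e, data))[:pop_size]
        # Abstract new concepts from the current best hypotheses.
        concepts |= abstract_concepts(population[:3])
    return population[0]


data = [(x, 3 * x) for x in range(5)]  # target function: y = 3x
best = lasr_sketch(data)
print(best, fitness(best, data))
```

Because the loop keeps the fittest survivors each generation, the best hypothesis's error is monotonically nonincreasing, even with the stubbed mutation operators.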
Submission history
From: Atharva Sehgal
[v1] Sat, 14 Sep 2024 08:17:30 UTC (1,893 KB)
[v2] Thu, 31 Oct 2024 19:02:17 UTC (1,896 KB)
[v3] Tue, 10 Dec 2024 16:24:48 UTC (1,896 KB)