How Do Language Models Compose Functions?
Apoorv Khandelwal, Ellie Pavlick
TL;DR
The paper investigates whether large language models solve two-hop compositional tasks through explicit compositional mechanisms or through idiomatic shortcuts. It frames each task as $y = g(f(x))$ and examines the models' intermediate representations. Using logit lens analyses of residual streams and embedding-space linearity tests, it identifies two processing modes: compositional (with a detectable $f(x)$ intermediate) and direct, or idiomatic (with no detectable signature of the intermediate). Which mode is used depends in part on how linear the embedding-space relation between $x$ and $g(f(x))$ is. The key findings are that a compositionality gap persists across modern models, even though larger model size and reasoning capabilities reduce it for some tasks, and that embedding-space linearity strongly predicts when idiomatic processing dominates. The work highlights the nuanced relationship between representation geometry and processing strategy in LLMs, offers interpretability-based insight into when compositional reasoning emerges, and suggests causal interventions as a route to further understanding. These results bear on theories of compositionality and generalization: they indicate that effective compositional behavior can arise from non-symbolic mechanisms shaped by pretraining rather than from explicit symbolic architectures.
Abstract
While large language models (LLMs) appear to be increasingly capable of solving compositional tasks, it is an open question whether they do so using compositional mechanisms. In this work, we investigate how feedforward LLMs solve two-hop factual recall tasks, which can be expressed compositionally as $g(f(x))$. We first confirm that modern LLMs continue to suffer from the "compositionality gap", i.e., their ability to compute both $z = f(x)$ and $y = g(z)$ does not entail their ability to compute the composition $y = g(f(x))$. Then, using logit lens on their residual stream activations, we identify two processing mechanisms: one which solves tasks $\textit{compositionally}$, computing $f(x)$ along the way to computing $g(f(x))$, and one which solves them $\textit{directly}$, without any detectable signature of the intermediate variable $f(x)$. Finally, we find that which mechanism is employed appears to be related to the embedding space geometry, with the direct (idiomatic) mechanism being dominant in cases where there exists a linear mapping from $x$ to $g(f(x))$ in the embedding spaces. We fully release our data and code at: https://github.com/apoorvkh/composing-functions
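
To make the "compositionality gap" concrete, the following is a minimal sketch of one common way to quantify it: among examples where the model answers both hops correctly in isolation, the fraction where it still fails the composed query. The record type `TwoHopResult`, its field names, and the function `compositionality_gap` are hypothetical names introduced here for illustration; they are not taken from the released code, and the paper's exact metric may differ.

```python
from dataclasses import dataclass


@dataclass
class TwoHopResult:
    """Per-example correctness flags (hypothetical schema, for illustration only)."""
    first_hop_correct: bool   # model produced z = f(x) when asked for it
    second_hop_correct: bool  # model produced y = g(z) when given z
    composed_correct: bool    # model produced y = g(f(x)) when given only x


def compositionality_gap(results):
    """Fraction of examples with both hops correct but the composition wrong.

    Returns 0.0 when no example has both hops correct, since the gap is
    undefined in that case (an assumption made here for convenience).
    """
    both_hops = [
        r for r in results if r.first_hop_correct and r.second_hop_correct
    ]
    if not both_hops:
        return 0.0
    failures = sum(1 for r in both_hops if not r.composed_correct)
    return failures / len(both_hops)


# Toy data: three examples have both hops correct; two of those fail to compose.
results = [
    TwoHopResult(True, True, True),
    TwoHopResult(True, True, False),
    TwoHopResult(True, False, False),
    TwoHopResult(True, True, False),
]
gap = compositionality_gap(results)
print(f"compositionality gap: {gap:.3f}")
```

Conditioning on both hops being correct separates failures of composition from failures of factual recall, which is the distinction the abstract draws.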
