The Geometry of Numerical Reasoning: Language Models Compare Numeric Properties in Linear Subspaces

Ahmed Oumar El-Shangiti, Tatsuya Hiraoka, Hilal AlQuabeh, Benjamin Heinzerling, Kentaro Inui

TL;DR

The paper investigates whether numerical reasoning in large language models relies on low-dimensional linear subspaces that encode numeric attributes. It identifies these subspaces with Partial Least Squares regression on contextual activations and tests causality by intervening along the first PLS component, observing changes in the model's Yes/No reasoning outcomes. Across three numeric properties (birth year, death year, latitude) and three instruction-tuned LLMs, the authors report $R^2>0.8$ for attribute prediction and demonstrate causal effects of subspace interventions, especially in earlier layers, supporting a two-step mechanism: extract numeric attributes from a linear subspace and perform reasoning using those directions. The findings advance interpretability of numerical reasoning in LLMs and suggest concrete avenues for probing and controlling how numeric information is represented and used internally.

Abstract

This paper investigates whether large language models (LLMs) utilize numerical attributes encoded in a low-dimensional subspace of the embedding space when answering questions involving numeric comparisons, e.g., "Was Cristiano born before Messi?" Using partial least squares regression, we first identify the subspaces that effectively encode the numerical attributes associated with the entities in comparison prompts. We then demonstrate causality by intervening in these subspaces to manipulate hidden states, thereby altering the LLM's comparison outcomes. Experiments on three different LLMs show that our results hold across different numerical attributes, indicating that LLMs utilize the linearly encoded information for numerical reasoning.
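The probing step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the activations are synthetic (a year encoded along one hidden direction plus Gaussian noise), the helper names `fit_pls1` and `r2_score` are hypothetical, and the single-target PLS fit is written out in NumPy rather than taken from a library. It only shows how a 5-component PLS probe and its held-out $R^2$ could be computed.

```python
import numpy as np

def fit_pls1(X, y, n_components):
    """Fit single-target PLS (PLS1) by iterative deflation.

    Returns the weight matrix W (features x components), the regression
    coefficients, and the training means needed for prediction.
    """
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                  # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                     # scores along that direction
        tt = t @ t
        p = Xk.T @ t / tt              # X loadings
        c = yk @ t / tt                # y loading
        Xk = Xk - np.outer(t, p)       # deflate X
        yk = yk - c * t                # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return W, coef, x_mean, y_mean

def r2_score(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic stand-in for layer activations (assumption, not real model data):
# each "entity" has a birth year encoded along one unit direction, plus noise.
rng = np.random.default_rng(0)
n, d = 500, 64
true_dir = rng.normal(size=d)
true_dir /= np.linalg.norm(true_dir)
years = rng.uniform(1800, 2000, size=n)
acts = rng.normal(size=(n, d)) + np.outer((years - 1900) / 10, true_dir)

# Fit on 400 entities, evaluate on the remaining 100.
train, test = slice(0, 400), slice(400, n)
W, coef, x_mean, y_mean = fit_pls1(acts[train], years[train], n_components=5)
pred = y_mean + (acts[test] - x_mean) @ coef
r2_test = r2_score(years[test], pred)

first_component = W[:, 0]              # direction used for interventions
print(f"held-out R^2: {r2_test:.3f}")
print(f"cosine with planted direction: {first_component @ true_dir:.3f}")
```

With real models, `acts` would be replaced by hidden states collected at a chosen layer and token position, and `years` by the ground-truth attribute values for each entity.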

Paper Structure

This paper contains 24 sections, 4 equations, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Summary of our approach. We extract contextualized numeric attribute activations, train a $k$-component PLS model on the activations to predict their values, and then use the first component of the PLS model to intervene at the last token of the second entity in the logical comparison.
  • Figure 2: The $R^2$ score for predicting entities' numerical attributes using a 5-component PLS model.
  • Figure 3: The effect of the intervention, measured as the ratio of answers flipped after intervening within the identified subspace at each layer, compared with the effect of intervening along a random direction sampled from a normal distribution.
  • Figure 4: $R^2$ score for predicting entities' birth years using a 5-component PLS model trained on Mistral 7B Instruct activations.
  • Figure 5: $R^2$ score for predicting entities' birth years using a 5-component PLS model trained on Qwen2.5 7B Instruct activations.
  • ...and 4 more figures
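
The comparison described in the Figure 3 caption (flip ratio along the first PLS component versus a random direction) can be sketched on toy data. Everything here is an assumption made for illustration: the hidden states are synthetic, `toy_answer` is a hypothetical stand-in for the model's Yes/No readout, the intervention is modeled as adding a scaled unit vector to the second entity's hidden state, and the strength `alpha` is arbitrary. It is not the authors' procedure.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 64, 400

# Synthetic hidden states (assumption): a year is encoded along a unit direction u.
u = rng.normal(size=d)
u /= np.linalg.norm(u)
years = rng.uniform(1800, 2000, size=n)
H = rng.normal(size=(n, d)) + np.outer((years - 1900) / 10, u)

# First PLS component for a single target: proportional to X_c^T y_c.
w = (H - H.mean(axis=0)).T @ (years - years.mean())
w /= np.linalg.norm(w)

# Baseline: a random direction sampled from a normal distribution, normalized.
r = rng.normal(size=d)
r /= np.linalg.norm(r)

def toy_answer(h_first, h_second):
    """Toy readout (hypothetical): True ('Yes') iff the first entity's
    encoded value is lower than the second's."""
    return (h_first @ u) < (h_second @ u)

def flip_ratio(direction, alpha):
    """Fraction of comparison pairs whose answer changes after shifting the
    second entity's hidden state by alpha along `direction`."""
    first, second = H[: n // 2], H[n // 2:]
    before = toy_answer(first, second)
    # Push the second entity toward the opposite answer: down if the answer
    # was 'Yes', up if it was 'No'.
    sign = np.where(before, -1.0, 1.0)[:, None]
    after = toy_answer(first, second + alpha * sign * direction)
    return np.mean(before != after)

alpha = 25.0  # intervention strength, chosen arbitrarily for this toy setup
flip_pls = flip_ratio(w, alpha)
flip_rand = flip_ratio(r, alpha)
print(f"flip ratio along first PLS component: {flip_pls:.2f}")
print(f"flip ratio along random direction:    {flip_rand:.2f}")
```

In this toy setup the readout uses the planted direction directly; in a real model the readout is the model's own forward pass, so the flip ratio has to be measured by re-running the model after patching the hidden state.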