On Representational Dissociation of Language and Arithmetic in Large Language Models

Riku Kisako, Tatsuki Kuribayashi, Ryohei Sasano

TL;DR

This work investigates whether large language models internally dissociate language processing from arithmetic reasoning. It employs linear probing with a linear SVM and GDV-based cluster separability to quantify separations across multiple data types (Lang, LangNum, Eq, EqSp, LangNumEq, GSM8K) and models, observing a clear, layerwise separation from the first layer. The results show near-perfect linear separability between language and arithmetic representations and negative GDV scores indicating persistent separation, though complex math-word-problem tasks form separate, language-like clusters rather than following a simple pipeline. These findings advance the language–thought dissociation discussion in AI systems and motivate causal, cross-model, and brain-alignment studies to validate and extend the observed modularity.

Abstract

The association between language and (non-linguistic) thinking ability in humans has long been debated, and recently, neuroscientific evidence of brain activity patterns has been considered. Such a scientific context naturally raises an interdisciplinary question -- what about such a language-thought dissociation in large language models (LLMs)? In this paper, as an initial foray, we explore this question by focusing on simple arithmetic skills (e.g., $1+2=$ ?) as a thinking ability and analyzing the geometry of their encoding in LLMs' representation space. Our experiments with linear classifiers and cluster separability tests demonstrate that simple arithmetic equations and general language input are encoded in completely separated regions in LLMs' internal representation space across all the layers, which is also supported with more controlled stimuli (e.g., spelled-out equations). These tentatively suggest that arithmetic reasoning is mapped into a distinct region from general language input, which is in line with the neuroscientific observations of human brain activations, while we also point out their somewhat cognitively implausible geometric properties.
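
The cluster separability test mentioned above can be illustrated with a small sketch. It assumes the commonly used Generalized Discrimination Value (GDV) definition: z-score every dimension, scale by 0.5, then take the mean intra-cluster distance minus the mean inter-cluster distance, divided by the square root of the dimensionality. The paper's exact preprocessing may differ. The toy Gaussian "representations" and all names (`gdv`, `make_blob`, `lang`, `arith`) are hypothetical stand-ins, not the authors' code or data.

```python
import math
import random
from itertools import combinations


def zscore_scale(points):
    """Z-score each dimension over all points, then multiply by 0.5."""
    n, dims = len(points), len(points[0])
    means = [sum(p[d] for p in points) / n for d in range(dims)]
    # Fall back to 1.0 when a dimension has zero variance.
    stds = [
        math.sqrt(sum((p[d] - means[d]) ** 2 for p in points) / n) or 1.0
        for d in range(dims)
    ]
    return [[0.5 * (p[d] - means[d]) / stds[d] for d in range(dims)] for p in points]


def mean_intra(cluster):
    """Mean pairwise Euclidean distance inside one cluster."""
    pairs = list(combinations(cluster, 2))
    return sum(math.dist(a, b) for a, b in pairs) / len(pairs)


def mean_inter(c1, c2):
    """Mean Euclidean distance between points of two different clusters."""
    return sum(math.dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


def gdv(clusters):
    """GDV under the assumed definition; more negative means better separated."""
    flat = [p for c in clusters for p in c]
    scaled = zscore_scale(flat)
    # Split the normalised points back into their original clusters.
    split, start = [], 0
    for c in clusters:
        split.append(scaled[start:start + len(c)])
        start += len(c)
    intra = sum(mean_intra(c) for c in split) / len(split)
    pairs = list(combinations(split, 2))
    inter = sum(mean_inter(a, b) for a, b in pairs) / len(pairs)
    return (intra - inter) / math.sqrt(len(flat[0]))


def make_blob(rng, center, n=40, dims=8):
    """Hypothetical stand-in for hidden states: an isotropic Gaussian blob."""
    return [[rng.gauss(center, 1.0) for _ in range(dims)] for _ in range(n)]


rng = random.Random(0)
lang = make_blob(rng, 0.0)         # toy "language" representations
arith = make_blob(rng, 6.0)        # toy "arithmetic" representations, far away
lang_again = make_blob(rng, 0.0)   # second sample from the "language" blob

separated = gdv([lang, arith])         # clearly negative
overlapping = gdv([lang, lang_again])  # close to zero
print(f"separated: {separated:.3f}, overlapping: {overlapping:.3f}")
```

Under this definition, a score near 0 means the clusters overlap and a score approaching -1 means they are very well separated, which is the sense in which negative GDV scores indicate a persistent language-arithmetic separation.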

Paper Structure

This paper contains 29 sections, 9 figures, 3 tables.

Figures (9)

  • Figure 1: An illustration of our perspective on investigating the language-arithmetic representational dissociation within language models (LMs): if brain imaging shows that human brain activation patterns differ between linguistic and (non-linguistic) reasoning stimuli, what about LMs?
  • Figure 2: The GDV between clusters of interest (for Gemma-2). "Language vs. Arithmetic" is the distance between Lang$\oplus$LangNum and Eq$\oplus$EqSp.
  • Figure 3: Linear classification results of gemma-2-9b-it
  • Figure 4: Linear classification results of Llama-3.1-8B-Instruct
  • Figure 5: Linear classification results of Qwen2.5-7B-Instruct
  • ...and 4 more figures