Feature Resemblance: On the Theoretical Understanding of Analogical Reasoning in Transformers

Ruichen Xu; Wenjing Yan; Ying-Jun Angela Zhang

TL;DR

A unified mechanism is revealed: transformers encode entities with similar properties into similar representations, and this feature alignment enables property transfer and shapes inductive reasoning capabilities.

Abstract

Understanding reasoning in large language models is complicated by evaluations that conflate multiple reasoning types. We isolate analogical reasoning (inferring shared properties between entities based on known similarities) and analyze its emergence in transformers. We theoretically prove three key results: (1) Joint training on similarity and attribution premises enables analogical reasoning through aligned representations; (2) Sequential training succeeds only when similarity structure is learned before specific attributes, revealing a necessary curriculum; (3) Two-hop reasoning ($a \to b, b \to c \implies a \to c$) reduces to analogical reasoning with identity bridges ($b = b$), which must appear explicitly in training data. These results reveal a unified mechanism: transformers encode entities with similar properties into similar representations, enabling property transfer through feature alignment. Experiments with architectures up to 1.5B parameters validate our theory and demonstrate how representational geometry shapes inductive reasoning capabilities.
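To make the task concrete, here is a minimal symbolic sketch of an analogical conclusion: given a similarity premise between two entities and an attribution premise for one of them, the attribute is transferred to the other. The function name, the tuple encoding, and the example entities are all hypothetical illustrations, not the paper's dataset or model; the paper studies how transformers learn this behavior, whereas the sketch only spells out the inference being tested.

```python
from collections import defaultdict


def analogical_conclusions(similar_pairs, attributions):
    """Transfer attributes across similar entities (toy, symbolic version).

    similar_pairs: iterable of (entity_a, entity_b) similarity premises.
    attributions:  iterable of (entity, attribute) attribution premises.
    Returns the set of (entity, attribute) pairs that are not premises
    but follow by analogy from one similarity and one attribution premise.
    """
    # Similarity is treated as symmetric here (an assumption of this sketch).
    neighbors = defaultdict(set)
    for a, b in similar_pairs:
        neighbors[a].add(b)
        neighbors[b].add(a)

    known = set(attributions)
    inferred = set()
    for entity, attribute in known:
        for other in neighbors[entity]:
            if (other, attribute) not in known:
                inferred.add((other, attribute))
    return inferred


# Hypothetical premises: "sparrow is similar to robin", "robin can fly".
similar_pairs = [("sparrow", "robin")]
attributions = [("robin", "can_fly")]
conclusions = analogical_conclusions(similar_pairs, attributions)
```

In this toy setting the only conclusion is that the sparrow shares the robin's attribute. In the paper's setting the analogous question is whether a trained transformer assigns that attribute to an entity it has only seen in similarity premises.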

Paper Structure (79 sections, 33 theorems, 211 equations, 1 figure, 6 tables)

Introduction
Related work
Mechanistic interpretability for reasoning in LLMs.
Theoretical analyses of LLM reasoning.
Notation.
Analogical Reasoning: Data Structure and Evaluation
Training dataset (premises).
Test Dataset (Analogical Conclusion).
Setup
Simplified one-layer transformers
Self-attention layer.
Linear MLP layer.
Loss functions
Training loss.
Test error.
...and 64 more sections

Key Result

Theorem 1

Suppose $\kappa = \Omega( \frac{m^{1/5}N^{1/5}\log^{2/5}(d)}{\lambda^{4/5}})$, $T_1 = \Theta(\frac{mn\log(d)}{\kappa\lambda^2\eta\sqrt{d}\sigma_0})$ and $T_2 = \Theta(\frac{\kappa mN^2}{\lambda^2d\sigma_0^2\eta })$. Under the paper's stated condition, with probability at least $1-\delta$, there exi

Figures (1)

Figure 1: Feature cosine similarity of data with the same labels of deep linear neural networks and GPT-2 trained on orthogonal data.
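The quantity named in the caption can be illustrated with a short sketch that averages pairwise cosine similarity over feature vectors sharing a label. This is an assumption about how such a metric might be computed, not the paper's evaluation code; the function names and the toy vectors are hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, nonzero vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def mean_same_label_cosine(features, labels):
    """Average cosine similarity over all pairs of features with equal labels."""
    sims = [
        cosine(features[i], features[j])
        for i, j in combinations(range(len(features)), 2)
        if labels[i] == labels[j]
    ]
    if not sims:
        raise ValueError("no pair of features shares a label")
    return sum(sims) / len(sims)


# Toy features: the two label-0 vectors point in the same direction.
features = [[1.0, 0.0], [2.0, 0.0], [0.0, 1.0]]
labels = [0, 0, 1]
aligned = mean_same_label_cosine(features, labels)
```

A value near 1 indicates that same-label inputs are mapped to nearly parallel features, which is the kind of alignment the caption refers to; a value near 0 indicates that they remain close to orthogonal.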

Theorems & Definitions (58)

Definition 1: Analogical argument
Remark 1
Theorem 1: Joint training on $\biguplus_{k=1}^{\kappa}(\mathcal{S}_1 \cup \mathcal{S}_2) \biguplus\mathcal{S}_3$
Proposition 1: Feature similarity
Remark 2: On the large-$\kappa$ regime
Remark 3
Theorem 2: S$\to$A curriculum enables analogical reasoning
Proposition 2: Feature similarity in S$\to$A curriculum
Theorem 3: A$\to$S curriculum fails to enable analogical reasoning
Proposition 3: No feature alignment in A$\to$S curriculum
...and 48 more
