Can language models learn analogical reasoning? Investigating training objectives and comparisons to human performance

Molly R. Petersen; Lonneke van der Plas

TL;DR

This work asks whether language models can learn analogical reasoning beyond static embeddings by introducing targeted training objectives that align relational structures, using $a:b::c:d$ as the core form. It proposes SBERT-inspired word-pair representations and three fine-tuning strategies (Simple Classifier, BERT a-b, BERT a-c) evaluated on SAT, U2/U4, and SCAN, with human distractor baselines and external semantic tasks. The findings show that the a-b objective yields measurable gains, especially on complex SCAN analogies, and ranking tasks outperform simple classification, with models approaching human performance on unseen analogies while preserving or improving external task performance. The results suggest that analogical reasoning can be learned from limited data and that such learning can transfer to related semantic tasks, though limitations include dataset size and permutation of analogies, pointing to future work on alternative relational measures and knowledge-enhanced training.

Abstract

While analogies are a common way to evaluate word embeddings in NLP, it is also of interest to investigate whether or not analogical reasoning is a task in itself that can be learned. In this paper, we test several ways to learn basic analogical reasoning, specifically focusing on analogies that are more typical of what is used to evaluate analogical reasoning in humans than those in commonly used NLP benchmarks. Our experiments find that models are able to learn analogical reasoning, even with a small amount of data. We additionally compare our models to a dataset with a human baseline, and find that after training, models approach human performance.
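
As a point of reference for the a:b::c:d form, the sketch below ranks candidate word pairs by how closely their vector offset matches the offset of the stem pair. This is a generic embedding-offset baseline, not the fine-tuned models described in the paper. The function names and the three-dimensional toy vectors are made up for illustration and do not come from any trained model or from the paper's datasets.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def offset(a, b):
    """Relation vector for the pair (a, b), taken here as b - a."""
    return [y - x for x, y in zip(a, b)]


def rank_candidates(stem, candidates, emb):
    """Rank candidate pairs (c, d) by similarity of (d - c) to the stem's (b - a)."""
    a, b = stem
    stem_rel = offset(emb[a], emb[b])
    scored = [
        (cosine(stem_rel, offset(emb[c], emb[d])), (c, d))
        for c, d in candidates
    ]
    # Highest similarity first.
    return sorted(scored, key=lambda s: s[0], reverse=True)


# Hypothetical toy embeddings, chosen by hand so the example is easy to follow.
TOY_EMB = {
    "man": [1.0, 0.0, 0.2],
    "woman": [1.0, 1.0, 0.2],
    "king": [3.0, 0.0, 0.9],
    "queen": [3.0, 1.0, 0.9],
    "apple": [0.0, 0.2, 2.0],
    "tree": [0.5, 0.0, 3.0],
}

ranking = rank_candidates(
    ("man", "woman"),
    [("king", "queen"), ("apple", "tree")],
    TOY_EMB,
)
best_pair = ranking[0][1]
print(best_pair)
```

With these hand-picked vectors the pair (king, queen) ranks first, because its offset is identical to that of (man, woman). Real embeddings are far noisier, which is part of why learning the relation directly, as the paper investigates, is of interest.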

Paper Structure (25 sections, 1 figure, 27 tables)

Introduction
Related Work
Methods
SBERT-modifications
Models
Model 1: Simple Classifier
Model 2: BERT a-b
Model 3: BERT a-c
Baselines: No fine-tuning and FastText
Datasets
SAT Dataset
U2 and U4
Scientific and Creative Analogy dataset (SCAN)
Human Baseline Comparison: Distractor Dataset
External Tasks: Semantic Similarity
...and 10 more sections

Figures (1)

Figure 1: Distribution of estimated word frequency seen in pre-training data, by number of tokens per word (among words seen <100,000 times)
