Humans and transformer LMs: Abstraction drives language learning

Jasper Jian; Christopher D. Manning

Abstract

Categorization is a core component of human linguistic competence. We study how a transformer-based language model (LM) learns linguistic categories by comparing its behaviour over the course of training with the behaviours that characterize abstract feature-based and concrete exemplar-based accounts of human language acquisition. Using novel divergence-based metrics that track learning trajectories through next-token distributions, we examine how lexical semantic and syntactic categories emerge. In experiments with GPT-2 small, we find that (i) when a construction is learned, abstract class-level behaviour is evident at earlier steps than lexical item-specific behaviour, and (ii) different linguistic behaviours emerge abruptly in sequence at different points in training, revealing that abstraction plays a key role in how LMs learn. This result informs the models of human language acquisition for which LMs may serve as an existence proof.

Paper Structure (40 sections, 9 figures, 2 tables)

Introduction
Background
Abstraction-first and exemplar-first learning in children
Related work on language models
Experiment 1: What drives argument structure learning in LMs?
Data and Models
Empirical domain
Trajectories
Dataset
Models
Paradigm and Metrics
Paradigm
Item learning metric
Class learning metric
Results: Distributional metrics
...and 25 more sections

Figures (9)

Figure 1: (a) Categories, like verb classes determined by argument structure, are pervasive in language: to Dative verbs take recipient arguments, whereas Motion verbs take goal locations. (b) Divergence metrics compare LM prediction distributions conditioned on different categories. (c) Tracking divergences over training reveals that categories like verb classes are differentiated early by LMs, before gradual item-level learning.
Figure 2: Pairwise $D_{JS}$ between verbs in two classes at three early steps in training. Each row and column represents a single verb. Differentiation into verb classes is evident when any change has occurred -- between-class quadrants have higher $D_{JS}$ (top-right, bottom-left) than in-class quadrants (top-left, bottom-right).
Figure 3: Mann-Whitney U-test between $U_A = \{D_{JS}(P_{v_{t}}(x) || P_v(x)) : v, v_{t} \in C_1\}$ (same class) and $U_B = \{D_{JS}(P_{v_{t}}(x) || P_v(x)) : v_{t} \in C_1, v \in C_2\}$ (different class), for verb classes $C_1, C_2$. Mean of $D_{JS}$ across verbs in the same class. Shading shows standard deviation.
Figure 4: Mean pairwise Spearman correlations between prototype noun trajectories for each verb across training. Within-class and between-class means are reported with 95% CI.
Figure 5: Top: Pairwise $D_{JS}$ across two classes in the exemplar-first baseline shows that verb-specific patterns emerge without clear class distinctions. Middle and bottom: The exemplar-first baseline (middle) shows divergence between verbs increasing, driven by verb-specific learning, and then decreasing, while GPT-2 is monotonic (bottom); baseline between-class mean divergences are not consistently greater than in-class means, unlike GPT-2. Shading is 95% CI.
...and 4 more figures
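
The comparison described in the captions of Figures 2 and 3 (pairwise $D_{JS}$ between verbs, then a Mann-Whitney U comparison of same-class against different-class divergences) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the three-token toy distributions, and the example verbs are all hypothetical, and the U statistic is computed by direct pair counting rather than with a statistics library.

```python
import math
from itertools import combinations, product


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2) between two discrete distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(a, b):
        # Terms with a_i == 0 contribute nothing; b_i > 0 whenever a_i > 0.
        return sum(ai * math.log2(ai / bi) for ai, bi in zip(a, b) if ai > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def class_divergences(dists, classes, c1, c2):
    """Return (same-class, different-class) lists of D_JS values.

    dists maps each verb to a next-token distribution; classes maps each
    verb to its class label. Same-class pairs are drawn from c1 only.
    """
    in_c1 = [v for v in dists if classes[v] == c1]
    in_c2 = [v for v in dists if classes[v] == c2]
    same = [js_divergence(dists[a], dists[b]) for a, b in combinations(in_c1, 2)]
    diff = [js_divergence(dists[a], dists[b]) for a, b in product(in_c1, in_c2)]
    return same, diff


def mann_whitney_u(xs, ys):
    """U statistic: number of (x, y) pairs with x < y, ties counted as one half."""
    return sum(1.0 if x < y else 0.5 if x == y else 0.0 for x in xs for y in ys)


# Hypothetical toy distributions over a three-token vocabulary (not real model output).
toy_dists = {
    "give": [0.70, 0.20, 0.10],
    "send": [0.65, 0.25, 0.10],
    "hand": [0.60, 0.25, 0.15],
    "walk": [0.10, 0.20, 0.70],
    "run": [0.15, 0.15, 0.70],
}
toy_classes = {
    "give": "dative", "send": "dative", "hand": "dative",
    "walk": "motion", "run": "motion",
}

same, diff = class_divergences(toy_dists, toy_classes, "dative", "motion")
u = mann_whitney_u(same, diff)
print(f"mean same-class D_JS = {sum(same) / len(same):.4f}")
print(f"mean diff-class D_JS = {sum(diff) / len(diff):.4f}")
print(f"U = {u} out of {len(same) * len(diff)} pairs")
```

In this toy setup every same-class divergence is smaller than every different-class divergence, so U reaches its maximum; with real next-token distributions the statistic would be tracked at each training step to see when the two sets separate.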
