Black Big Boxes: Tracing Adjective Order Preferences in Large Language Models

Jaap Jumelet; Lisa Bylinina; Willem Zuidema; Jakub Szymanik

TL;DR

The paper investigates how adjective order preferences (AOPs) in large language models relate to training data statistics and sentence context. Using the CAP corpus and the Pythia model suite, it shows that AOPs largely align with $n$-gram frequencies and emerge early in training, but also persist for unseen orders, indicating generalization beyond memorization. Contextual cues, both local collocations and earlier semantic hints, systematically influence AOPs, as revealed through attribution analyses. The results suggest that distributional learning, generalization, and contextual information jointly shape word-order preferences in LMs, and the work offers methodological tools for probing human-language phenomena with neural models.

Abstract

In English and other languages, multiple adjectives in noun phrases follow intricate ordering patterns. These patterns have been widely studied in linguistics and provide a useful test case for assessing how language models (LMs) acquire graded and context-sensitive word order preferences. We ask to what extent adjective order preferences in LMs can be explained by distributional learning alone, and where models exhibit behaviour that goes beyond surface co-occurrence patterns. We find that LM predictions are largely explained by training data frequencies: simple n-gram statistics account for much of their behaviour and closely mirror the preferences learned during training. However, by analysing learning dynamics we reveal that models also generalize robustly to unseen adjective combinations, indicating that their behaviour cannot be reduced to memorization of observed orders alone. Moreover, we show how LMs leverage word order cues from sentence context, demonstrating with feature attribution methods that contextual cues are an additional driver of adjective order in LM output.
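As a rough illustration of the kind of n-gram statistic the abstract refers to, the sketch below scores an adjective pair by the log-ratio of smoothed bigram counts for its two orders. It is not the paper's AOP-$\Delta$ or AOP-% computation: the function names, the add-one smoothing, and the toy corpus (standing in for The Pile) are all assumptions made for this example.

```python
import math
from collections import Counter


def bigram_counts(sentences):
    """Count adjacent token pairs in a list of whitespace-tokenized sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts


def order_log_ratio(counts, adj1, adj2, smoothing=1.0):
    """Log-ratio of smoothed counts for 'adj1 adj2' versus 'adj2 adj1'.

    Positive values mean the corpus favours the order adj1 adj2.
    The add-one smoothing is an assumption that keeps unseen orders finite.
    """
    forward = counts[(adj1, adj2)] + smoothing
    backward = counts[(adj2, adj1)] + smoothing
    return math.log(forward / backward)


def summarize(deltas):
    """Return the mean log-ratio and the percentage of positive log-ratios."""
    mean_delta = sum(deltas) / len(deltas)
    share = 100.0 * sum(d > 0 for d in deltas) / len(deltas)
    return mean_delta, share


# Hypothetical toy corpus; a real analysis would use the training corpus.
toy_corpus = [
    "the big black box fell",
    "a big black dog barked",
    "the black big box is odd",
    "she bought a small red car",
]

counts = bigram_counts(toy_corpus)
pairs = [("big", "black"), ("small", "red"), ("red", "small")]
deltas = [order_log_ratio(counts, a, b) for a, b in pairs]
mean_delta, share = summarize(deltas)

for (a, b), delta in zip(pairs, deltas):
    print(f"{a} {b}: {delta:+.3f}")
print(f"mean log-ratio: {mean_delta:+.3f}, orders favoured: {share:.1f}%")
```

A model-based analogue would replace the counts with LM probabilities for the two orders; comparing the two scores per adjective pair is one way to ask how much of a model's preference is explained by corpus frequency.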

Paper Structure (43 sections, 5 equations, 10 figures, 2 tables)

Introduction
Background
Adjective Order Theory
Word Order in Language Models
Methods
Measuring Word Order Preference
Evaluation Corpus
Procedure
Models
AOP in LMs
Model Size
Learning Dynamics
Localizing AOP
Conclusion
The Role of Training Data Statistics in AOP
...and 28 more sections

Figures (10)

Figure 1: We connect the adjective order preferences (AOP-$\Delta$, §\ref{sec:aop-description}) of language models (here Pythia-12b) to the adjective order frequencies of the corpus they were trained on (The Pile). We highlight several regions of interest: adjective pairs for which both orders are rare, which require the LM to generalize from other adjective orders; pairs for which one order is far more common, which can be resolved from frequency alone; and orders with high variance.
Figure 2: A--B: AOP-% and AOP-$\Delta$ scores for Pythia models of increasing size. C--D: AOP-% and AOP-$\Delta$ scores for Pythia-1.4b during training. We highlight the three learning phases: 1) initialization, 2) acquisition, and 3) consolidation.
Figure 3: Average token probabilities for the original and swapped adjective orders on Pythia-12b, without and with sentence context (A--B), as well as the token-level differences (C) that correspond to the difference between the curves in (A) and (B).
Figure 4: Correlations during training of LM probabilities for single adjectives, adjective pairs, and adjective-noun triplets with respect to their frequency in The Pile.
Figure 5: The contextual AOP-% performance for Pythia-70m and 1.4b across training, split by whether items have been seen 0, 1, 2 to 10, or more than 10 times at each checkpoint. The sizes of these splits across training are given in Appendix \ref{app:split-sizes}.
...and 5 more figures
