Black Big Boxes: Tracing Adjective Order Preferences in Large Language Models
Jaap Jumelet, Lisa Bylinina, Willem Zuidema, Jakub Szymanik
TL;DR
The paper investigates how adjective order preferences (AOPs) in large language models relate to training data statistics and sentence context. Using the CAP corpus and the Pythia model suite, it shows that AOPs largely align with n-gram frequencies and emerge early in training, but also persist for adjective combinations unseen in training, indicating generalization beyond memorization. Contextual cues, both local collocations and earlier semantic hints, systematically influence AOPs, as revealed through attribution analyses. The work suggests a nuanced picture in which distributional learning, generalization, and contextual information jointly shape word-order preferences in LMs, and it offers methodological tools for probing human-language phenomena with neural models.
Abstract
In English and other languages, multiple adjectives in noun phrases follow intricate ordering patterns. These patterns have been widely studied in linguistics and provide a useful test case for assessing how language models (LMs) acquire graded and context-sensitive word order preferences. We ask to what extent adjective order preferences in LMs can be explained by distributional learning alone, and where models exhibit behaviour that goes beyond surface co-occurrence patterns. We find that LM predictions are largely explained by training data frequencies: simple n-gram statistics account for much of their behaviour and closely mirror the preferences learned during training. However, by analysing learning dynamics we reveal that models also generalize robustly to unseen adjective combinations, indicating that their behaviour cannot be reduced to memorization of observed orders alone. Moreover, we show how LMs leverage word order cues from sentence context, demonstrating with feature attribution methods that contextual cues are an additional driver of adjective order in LM output.
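To make the notion of an n-gram account of adjective order concrete, the following minimal sketch scores a preference as a smoothed log-odds ratio between the trigram counts of the two possible orders. It is an illustration under stated assumptions, not the paper's actual metric: the function names `count_trigrams` and `order_preference`, the add-`alpha` smoothing, and the toy corpus are all hypothetical.

```python
import math
from collections import Counter


def count_trigrams(sentences):
    """Count word trigrams over whitespace-tokenized, lowercased sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        counts.update(zip(tokens, tokens[1:], tokens[2:]))
    return counts


def order_preference(counts, adj1, adj2, noun, alpha=1.0):
    """Smoothed log-odds of 'adj1 adj2 noun' over 'adj2 adj1 noun'.

    Positive values favour the order adj1-adj2; zero means no preference.
    The add-alpha smoothing is an assumption made for this sketch.
    """
    forward = counts[(adj1, adj2, noun)] + alpha
    reverse = counts[(adj2, adj1, noun)] + alpha
    return math.log(forward / reverse)


# Toy corpus, invented purely for illustration.
toy_corpus = [
    "she bought a big black box",
    "a big black box arrived",
    "he saw a black big box",
]

counts = count_trigrams(toy_corpus)
score = order_preference(counts, "big", "black", "box")
unseen = order_preference(counts, "small", "red", "box")
```

Here `score` is positive because "big black box" is more frequent than "black big box" in the toy corpus, while `unseen` is exactly zero: a purely count-based score has no preference for combinations it has never observed, which is the kind of case where a language model's generalization would have to come from something other than memorized orders.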
