Linguistic Minimal Pairs Elicit Linguistic Similarity in Large Language Models

Xinyu Zhou, Delong Chen, Samuel Cahyawijaya, Xufeng Duan, Zhenguang G. Cai

TL;DR

This work presents a minimal-pair activation-difference probing method to quantify internal linguistic representations in large language models. It defines linguistic similarity as the cosine similarity between activation-difference vectors $\Delta z$ derived from grammatically correct and incorrect sentences across 150k minimal pairs from BLiMP, SLING, and RuBLiMP, evaluated over 100+ LLMs and three languages. Key findings show stronger cross-LLM alignment in higher-resource languages, close alignment with fine-grained linguistic categories but weaker with semantic similarity, and partial but notable cross-lingual coherence with language-specific clustering. The study provides a quantitative bridge between neural representations and linguistic theory, contributing data and code publicly to enable further exploration of LLM linguistic knowledge.

Abstract

We introduce a novel analysis that leverages linguistic minimal pairs to probe the internal linguistic representations of Large Language Models (LLMs). By measuring the similarity between LLM activation differences across minimal pairs, we quantify linguistic similarity and gain insight into the linguistic knowledge captured by LLMs. Our large-scale experiments, spanning 100+ LLMs and 150k minimal pairs in three languages, reveal properties of linguistic similarity from four key aspects: consistency across LLMs, relation to theoretical categorizations, dependency on semantic context, and cross-lingual alignment of relevant phenomena. Our findings suggest that: 1) linguistic similarity is significantly influenced by training data exposure, leading to higher cross-LLM agreement in higher-resource languages; 2) linguistic similarity aligns strongly with fine-grained theoretical linguistic categories but only weakly with broader ones; 3) linguistic similarity correlates only weakly with semantic similarity, indicating its context-dependent nature; and 4) LLMs exhibit limited cross-lingual alignment in their understanding of relevant linguistic phenomena. This work demonstrates the potential of minimal pairs as a window into the neural representations of language in LLMs, shedding light on the relationship between LLMs and linguistic theory. Code and data are available at https://github.com/ChenDelong1999/Linguistic-Similarity
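
As a rough illustration of the measurement described above, the following sketch computes an activation difference for each minimal pair and then takes the cosine similarity between two such differences. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the activations are toy three-dimensional vectors rather than real LLM hidden states, and the direction of the difference (correct minus incorrect) is an assumption.

```python
import math


def activation_difference(z_correct, z_incorrect):
    """Delta z for one minimal pair (assumed direction: correct minus incorrect)."""
    if len(z_correct) != len(z_incorrect):
        raise ValueError("activation vectors must have the same length")
    return [a - b for a, b in zip(z_correct, z_incorrect)]


def cosine_similarity(u, v):
    """Cosine similarity between two vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def linguistic_similarity(pair_a, pair_b):
    """Cosine similarity between the activation differences of two minimal pairs."""
    dz_a = activation_difference(*pair_a)
    dz_b = activation_difference(*pair_b)
    return cosine_similarity(dz_a, dz_b)


# Toy activations (hypothetical values), each pair given as (correct, incorrect).
pair_1 = ([1.0, 2.0, 0.5], [0.0, 2.0, 0.5])  # difference lies along dimension 0
pair_2 = ([3.0, 1.0, 1.0], [1.0, 1.0, 1.0])  # difference also lies along dimension 0
pair_3 = ([0.0, 1.0, 0.0], [0.0, 0.0, 0.0])  # difference lies along dimension 1

sim_same = linguistic_similarity(pair_1, pair_2)
sim_diff = linguistic_similarity(pair_1, pair_3)
print(sim_same, sim_diff)
```

In this toy setup, pairs whose differences point in the same direction score close to 1, while pairs whose differences are orthogonal score close to 0. With a real model, the vectors would be hidden-state activations extracted for each sentence of a minimal pair.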

Paper Structure (18 sections, 17 figures, 2 tables)

Figures (17)

  • Figure 1: The process of measuring linguistic similarity in an LLM. We extract LLM activations for sentences in linguistic minimal pairs and compute their differences. Since the sentences differ solely in a specific linguistic phenomenon, the resulting difference only contains information about that phenomenon. We then measure the similarity between these activation differences, which we refer to as linguistic similarity.
  • Figure 2: The relationship between linguistic similarity across LLMs. In English, LLMs form a single cluster, while in Chinese, two distinct clusters emerge: one for bilingual and multilingual LLMs, and another for English-only models. Detailed visualizations can be found in the appendix on LLM alignment.
  • Figure 3: Distribution of LLM alignment scores, with red dotted lines marking the average scores of 0.471 (BLiMP), 0.414 (SLING), and 0.139 (RuBLiMP).
  • Figure 4: Intra-class and inter-class linguistic similarities at different levels of linguistic classification. At the most fine-grained level (1st level), intra-class similarities are significantly higher than inter-class similarities, indicating a strong alignment with detailed theoretical linguistic categorizations. As we move to broader categories (2nd and 3rd levels), the gap between intra-class and inter-class similarities narrows notably.
  • Figure 5: Phenomena-level linguistic similarity matrix of BLiMP. Each cell corresponds to the average similarity between two linguistic phenomena. The 2nd-level categories (linguistic terms) and the 3rd-level categories are separated by dashed and bold black lines, respectively. On the left, we provide the labels of the 1st to 3rd levels of linguistic classification, separated by "|". Visualizations of SLING and RuBLiMP can be found in the appendix on phenomena similarities.
  • ...and 12 more figures
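
The intra-class versus inter-class comparison described in the Figure 4 caption can be sketched as follows. This is an illustrative sketch only: the function name, the toy similarity matrix, and the category labels are hypothetical, and the authors' exact averaging procedure may differ. Given a symmetric matrix of pairwise linguistic similarities and one category label per item, it averages the similarities of same-label pairs and of different-label pairs separately.

```python
def intra_inter_class_means(sim, labels):
    """Mean similarity over same-label pairs and over different-label pairs.

    sim    -- symmetric matrix (list of lists) of pairwise similarities
    labels -- one category label per row/column of sim
    """
    intra, inter = [], []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only, diagonal skipped
            if labels[i] == labels[j]:
                intra.append(sim[i][j])
            else:
                inter.append(sim[i][j])

    def mean(values):
        return sum(values) / len(values) if values else float("nan")

    return mean(intra), mean(inter)


# Toy similarity matrix for four items (hypothetical values).
sim = [
    [1.0, 0.8, 0.1, 0.2],
    [0.8, 1.0, 0.3, 0.0],
    [0.1, 0.3, 1.0, 0.6],
    [0.2, 0.0, 0.6, 1.0],
]
labels = ["agreement", "agreement", "binding", "binding"]

intra_mean, inter_mean = intra_inter_class_means(sim, labels)
print(intra_mean, inter_mean)
```

Repeating this with labels taken from each level of the classification (1st, 2nd, 3rd) would show how the gap between the two averages changes as the categories become broader.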