The Representational Alignment between Humans and Language Models is implicitly driven by a Concreteness Effect

Cosimo Iaia, Bhavin Choksi, Emily Wiebers, Gemma Roig, Christian J. Fiebach

TL;DR

The paper investigates whether the concreteness dimension is similarly represented in humans and language-model embeddings. Using Representational Similarity Analysis on an odd-one-out behavioral space and multiple German word embeddings, the authors show significant human–LM alignment that is largely driven by concreteness. An ablation approach demonstrates that removing concreteness from embeddings yields the largest drop in alignment, indicating a shared, implicit concreteness representation despite no explicit training on concreteness. These findings imply a convergent organization of concreteness in human and machine semantic representations and highlight the need to consider concreteness in cross-domain semantic modeling, with future work extending to neural data and broader vocabularies.

Abstract

The nouns of our language refer to either concrete entities (like a table) or abstract concepts (like justice or love), and cognitive psychology has established that concreteness influences how words are processed. Accordingly, understanding how concreteness is represented in our mind and brain is a central question in psychology, neuroscience, and computational linguistics. While the advent of powerful language models has allowed for quantitative inquiries into the nature of semantic representations, it remains largely underexplored how they represent concreteness. Here, we used behavioral judgments to estimate semantic distances implicitly used by humans, for a set of carefully selected abstract and concrete nouns. Using Representational Similarity Analysis, we find that the implicit representational space of participants and the semantic representations of language models are significantly aligned. We also find that both representational spaces are implicitly aligned to an explicit representation of concreteness, which was obtained from our participants using an additional concreteness rating task. Importantly, using ablation experiments, we demonstrate that the human-to-model alignment is substantially driven by concreteness, but not by other important word characteristics established in psycholinguistics. These results indicate that humans and language models converge on the concreteness dimension, but not on other dimensions.
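The alignment step described in the abstract can be illustrated with a minimal Representational Similarity Analysis sketch. This is not the authors' implementation: the cosine distance, the Spearman correlation, the 300-dimensional random embeddings, and the function names `embedding_rdm_vector` and `rsa_alignment` are all assumptions made for illustration.

```python
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr


def embedding_rdm_vector(embeddings, metric="cosine"):
    """Return the condensed RDM (upper triangle) of pairwise word distances.

    The cosine metric is an assumption; the source does not name the metric.
    """
    return pdist(embeddings, metric=metric)


def rsa_alignment(rdm_vec_a, rdm_vec_b):
    """Rank-correlate two condensed RDMs (Spearman is an assumed choice)."""
    rho, p_value = spearmanr(rdm_vec_a, rdm_vec_b)
    return rho, p_value


# Placeholder data: 40 words with random 300-d vectors standing in for
# real language-model embeddings (hypothetical values, not the paper's data).
rng = np.random.default_rng(0)
n_words, dim = 40, 300
model_embeddings = rng.normal(size=(n_words, dim))
model_rdm_vec = embedding_rdm_vector(model_embeddings)

# Stand-in "behavioral" RDM: the model RDM plus noise, so that the two
# spaces are related but not identical.
n_pairs = n_words * (n_words - 1) // 2
behavioral_rdm_vec = model_rdm_vec + rng.normal(scale=0.05, size=n_pairs)

rho, p = rsa_alignment(model_rdm_vec, behavioral_rdm_vec)
print(f"RSA alignment: rho={rho:.3f}, p={p:.3g}")
```

Only the off-diagonal upper triangle of each matrix is compared, because the diagonal is trivially zero and the lower triangle duplicates the upper one.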

Paper Structure

This paper contains 16 sections, 4 figures, 1 table.

Figures (4)

  • Figure 1: Schematic of the approach: Words are sampled from the semantic space (top left) and used for an odd-one-out task. The German example words translate as follows: Körper (body), Gesicht (face), Figur (figure). For a set of 40 words, we collected a total of 9880 odd-one-out choices. These were converted into a representational dissimilarity matrix (RDM) reflecting pairwise semantic distances between words. For each language model, a corresponding computational RDM is created from the word embeddings of the same 40 words. The two RDMs are then compared with each other.
  • Figure 2: Partial correlations for the behavioral model (odd-one-out) and the computational models (language models): The representational space derived from the odd-one-out task (in blue) correlates only with the rated concreteness space, while the language models (other colors) are also aligned to other feature spaces. The representational spaces derived from all language models except GPT2 (in red) show alignment not only to concreteness but also to word frequency. GPT2, in contrast, correlates with word length and OLD20. (*** p < .001, ** p < .01, * p < .05).
  • Figure 3: Representational Similarity Analysis after removing each feature: Compared to the original correlations between the non-ablated computational representation (lightest shade of blue) and the representation derived from the odd-one-out task, the biggest drop is observed when removing concreteness (dark blue) for all language models. (Williams' test, *** p < .001, ** p < .01, * p < .05)
  • Figure 4: Control Analysis: Representational Similarity Analysis after removing further semantic features. The ablation approach reported in the main paper was repeated with further semantic dimensions, i.e., word imageability, word arousal, and word valence. Compared to the base correlations, removing concreteness resulted in the biggest drop for all language models.
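
The conversion of odd-one-out choices into an RDM, as outlined in the Figure 1 caption, could look roughly like the sketch below. The 9880 choices reported there equal the number of distinct triplets that can be drawn from 40 words, C(40, 3) = 9880; that each triplet was judged exactly once is an assumption. The scoring rule (a pair counts as similar whenever neither of its words is picked as the odd one), the function name `odd_one_out_rdm`, and the random choices are likewise assumptions for illustration, not the authors' procedure.

```python
from itertools import combinations
from math import comb

import numpy as np


def odd_one_out_rdm(n_items, choices):
    """Build an RDM from odd-one-out choices.

    `choices` is an iterable of ((i, j, k), odd), where `odd` is the item
    picked as the odd one out of the triplet (i, j, k).
    Assumed rule: similarity of a pair = fraction of the triplets containing
    the pair in which neither item was picked as the odd one.
    """
    kept_together = np.zeros((n_items, n_items))
    shown_together = np.zeros((n_items, n_items))
    for triplet, odd in choices:
        for a, b in combinations(triplet, 2):
            shown_together[a, b] += 1
            shown_together[b, a] += 1
        # The two items that remain are implicitly judged as more similar.
        a, b = (x for x in triplet if x != odd)
        kept_together[a, b] += 1
        kept_together[b, a] += 1
    similarity = np.divide(
        kept_together,
        shown_together,
        out=np.zeros_like(kept_together),
        where=shown_together > 0,
    )
    rdm = 1.0 - similarity
    np.fill_diagonal(rdm, 0.0)
    return rdm


n_words = 40
triplets = list(combinations(range(n_words), 3))
assert len(triplets) == comb(n_words, 3)  # one trial per distinct triplet

# Random choices stand in for participant responses (hypothetical data).
rng = np.random.default_rng(0)
choices = [(t, t[rng.integers(3)]) for t in triplets]

rdm = odd_one_out_rdm(n_words, choices)
print(rdm.shape)
```

With one trial per triplet, every word pair appears in 38 triplets, so each RDM entry is estimated from 38 choices.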