A Grounded Typology of Word Classes

Coleman Haley, Sharon Goldwater, Edoardo Ponti

TL;DR

This work introduces a grounded typology by using images as language-neutral proxies for meaning and quantifying semantic contentfulness through a PMI-based groundedness measure. It compares image-conditioned captioning with text-only language models to estimate how much meaning is captured by word classes across 30 languages, revealing a robust lexical–functional gradient with nouns, adjectives, and verbs generally more grounded than functional classes. The approach yields a dataset of groundedness scores and demonstrates partial alignment with psycholinguistic concreteness norms, while challenging some assumptions about adpositions and semantic content in function words. Overall, the method provides a quantitative, cross-linguistic tool for studying semantic function in language, with potential for broader multimodal typology research and future data/model improvements.

Abstract

We propose a grounded approach to meaning in language typology. We treat data from perceptual modalities, such as images, as a language-agnostic representation of meaning. Hence, we can quantify the function–form relationship between images and captions across languages. Inspired by information theory, we define "groundedness", an empirical measure of contextual semantic contentfulness (formulated as a difference in surprisal) which can be computed with multilingual multimodal language models. As a proof of concept, we apply this measure to the typology of word classes. Our measure captures the contentfulness asymmetry between functional (grammatical) and lexical (content) classes across languages, but contradicts the view that functional classes do not convey content. Moreover, we find universal trends in the hierarchy of groundedness (e.g., nouns > adjectives > verbs), and show that our measure partly correlates with psycholinguistic concreteness norms in English. We release a dataset of groundedness scores for 30 languages. Our results suggest that the grounded typology approach can provide quantitative evidence about semantic function in language.
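
To illustrate the "difference in surprisal" formulation, here is a minimal sketch that computes a token's groundedness as the text-only LM surprisal minus the image-conditioned captioning-model surprisal. This difference is the pointwise mutual information (PMI) between the token and the image, given the preceding text. The function name, the choice of bits as the unit, and all log-probabilities below are hypothetical illustrations, not values or code from the paper.

```python
import math

def groundedness(lm_logprob: float, cap_logprob: float) -> float:
    """Return PMI in bits between a token and an image.

    lm_logprob:  natural-log probability of the token under a text-only LM.
    cap_logprob: natural-log probability of the same token under an
                 image-conditioned captioning model (same text context).
    """
    lm_surprisal = -lm_logprob / math.log(2)    # bits, without the image
    cap_surprisal = -cap_logprob / math.log(2)  # bits, with the image
    return lm_surprisal - cap_surprisal

# Hypothetical per-token probabilities for the caption "a dog":
# (token, log p under the LM, log p under the captioning model)
tokens = [
    ("a", math.log(0.30), math.log(0.32)),
    ("dog", math.log(0.01), math.log(0.40)),
]

scores = {tok: groundedness(lm, cap) for tok, lm, cap in tokens}
for tok, score in scores.items():
    print(f"{tok}: {score:.2f} bits")
```

In this toy example the image makes "dog" far more predictable while barely changing the probability of "a", so the noun receives a much higher score than the determiner. A negative score would mean the image made the token less predictable.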

Paper Structure

This paper contains 27 sections, 5 equations, 5 figures, 1 table.

Figures (5)

  • Figure 1: Mean and standard deviation of per-language mutual information estimates between word class and image. Across 30 languages, we see clear and consistent tendencies about which parts of speech are more "grounded", corresponding to a distinction between lexical and functional classes.
  • Figure 2: Heatmap of mutual information estimates across parts of speech in thirty languages. Cells show the statistical significance of a word class's groundedness (MI > 0). Unattested classes are white. Some functional classes display non-significant levels of groundedness in several languages, while lexical classes predominantly show highly significant grounding.
  • Figure 3: Word token level distributions of the groundedness measure (PMI) across all languages and datasets, grouped by part of speech (word class). We also report the estimated marginal mean and ranking of each word class. Colors are based on the ranking of classes, rather than their average PMIs. Overall, the distribution and estimated ranking of word classes strongly suggest our groundedness measure quantitatively captures the distinction between lexical and functional classes.
  • Figure 4: Correlation between human concreteness ratings and type-level groundedness (PMI; left, $\rho\,{=}\,0.368$) or uncertainty coefficient (right, $\rho\,{=}\,0.609$): i.e., the average ratio between LM surprisal and captioning model surprisal.
  • Figure 5: Correlation between English psycholinguistic norms and type-level groundedness (left) or uncertainty coefficient (right): i.e., the average ratio between LM surprisal and captioning model surprisal. Type-level measures were computed by averaging scores across the COCO-dev dataset for types which occur at least 30 times.
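
The type-level averaging described in the Figure 5 caption can be sketched as follows: token-level scores are grouped by word type, and only types with at least 30 occurrences are kept. The function name `type_level_means` and the toy scores are hypothetical; only the threshold of 30 comes from the caption.

```python
from collections import defaultdict

def type_level_means(token_scores, min_count=30):
    """Average token-level scores per word type, keeping frequent types only.

    token_scores: iterable of (word_type, score) pairs, one per token.
    min_count:    minimum number of occurrences a type needs to be kept.
    """
    buckets = defaultdict(list)
    for word, score in token_scores:
        buckets[word].append(score)
    return {
        word: sum(vals) / len(vals)
        for word, vals in buckets.items()
        if len(vals) >= min_count
    }

# Hypothetical token-level groundedness scores (in bits).
token_scores = [("dog", 5.0)] * 30 + [("the", 0.1)] * 40 + [("zebra", 6.0)] * 3

means = type_level_means(token_scores)
print(sorted(means))  # "zebra" is dropped: it occurs fewer than 30 times
```

The frequency threshold trades coverage for stability, since averages over only a few tokens would be noisy when correlated with human ratings.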