NeLLCom-Lex: A Neural-agent Framework to Study the Interplay between Lexical Systems and Language Use

Yuqing Zhang, Ecesu Ürker, Tessa Verhoef, Gemma Boleda, Arianna Bisazza

TL;DR

This work introduces NeLLCom-Lex, a neural-agent framework grounded in an existing lexical system to simulate semantic change under varying communicative needs, using a color naming task within a single generation. By combining supervised pretraining with reinforcement learning in a grounded referential game, the authors demonstrate that agents can reproduce human-like pragmatic naming and develop robust, adaptable lexicons. The framework allows explicit manipulation of contextual demands and communication pressures to study how use and lexicon co-evolve, providing a causal modeling tool for semantic change mechanisms. The results indicate that context access and communicative need drive both in-context naming and global lexicon properties, offering a path toward mechanistic insights into lexical semantic evolution.

Abstract

Lexical semantic change has primarily been investigated with observational and experimental methods; however, observational methods (corpus analysis, distributional semantic modeling) cannot get at causal mechanisms, and experimental paradigms with humans are hard to apply to semantic change due to the extended diachronic processes involved. This work introduces NeLLCom-Lex, a neural-agent framework designed to simulate semantic change by first grounding agents in a real lexical system (e.g. English) and then systematically manipulating their communicative needs. Using a well-established color naming task, we simulate the evolution of a lexical system within a single generation, and study which factors lead agents to: (i) develop human-like naming behavior and lexicons, and (ii) change their behavior and lexicons according to their communicative needs. Our experiments with different supervised and reinforcement learning pipelines show that neural agents trained to 'speak' an existing language can reproduce human-like patterns in color naming to a remarkable extent, supporting the further use of NeLLCom-Lex to elucidate the mechanisms of semantic change.

Paper Structure

This paper contains 45 sections, 7 equations, 14 figures, 5 tables.

Figures (14)

  • Figure 1: Overview of agent architectures and the referential communication game. The speaking agent is presented with a target color ($c_t$) and, depending on the experimental condition, may also receive two distractor colors interpreted as context (dashed box). The listening agent always receives three colors and has to guess the index $i$ of the correct one ($c_t$), in this case 2. Both agents are first trained by SL on a dataset of human interactions.
  • Figure 2: Mean speaking and listening accuracy on test$_{hum}$, and mean communication accuracy on test$_{gen,distH}$ as a function of training epochs across different training pipelines. In the right-most plot, epoch 0 denotes the state immediately after SL and before RL. Shaded regions indicate 95% confidence intervals. All values are averaged over 10 random seeds.
  • Figure 3: Word informativeness as a function of context ease on test$_{gen,distH}$ for 2 representative seeds.
  • Figure 4: Communication accuracy on test$_{gen,dist50}$ for varying RL training distributions, as a function of RL training epoch. Each plot shows accuracy on a different test subset: Overall (O), Far (F), and Close (C).
  • Figure 5: Denotation in the CIELAB color space for pink in the three RL training conditions.
  • ...and 9 more figures
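
The referential game described in the Figure 1 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the paper's agents are neural networks trained with SL and RL, whereas the toy speaker and listener below are nearest-prototype rules swapped in purely for illustration. The names `play_round`, `toy_speaker`, `toy_listener`, and `PROTOTYPES`, and all color coordinates, are hypothetical.

```python
import random

# Hypothetical three-word lexicon with made-up prototype coordinates
# (loosely CIELAB-like triples; not values from the paper).
PROTOTYPES = {
    "red": (50.0, 70.0, 50.0),
    "green": (50.0, -70.0, 50.0),
    "blue": (50.0, 10.0, -70.0),
}


def dist2(a, b):
    """Squared Euclidean distance between two color triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def toy_speaker(target, context):
    """Name the target with the nearest prototype word.

    `context` holds the two distractors, or None when the speaker has no
    context access. This toy rule ignores it; a context-aware speaker could
    use it to pick a more discriminative word.
    """
    return min(PROTOTYPES, key=lambda w: dist2(PROTOTYPES[w], target))


def toy_listener(word, candidates):
    """Guess the index of the candidate closest to the word's prototype."""
    proto = PROTOTYPES[word]
    return min(range(len(candidates)), key=lambda i: dist2(proto, candidates[i]))


def play_round(speaker, listener, target, distractors, speaker_sees_context, rng):
    """Play one round; return True if the listener picks the target."""
    # The listener always receives all three colors, in shuffled order.
    candidates = [target] + list(distractors)
    rng.shuffle(candidates)
    target_index = candidates.index(target)  # assumes the colors are distinct

    # The speaker receives the distractors only in the context condition.
    context = list(distractors) if speaker_sees_context else None
    word = speaker(target, context)
    guess = listener(word, candidates)
    return guess == target_index


rng = random.Random(0)
target = (52.0, 65.0, 45.0)
distractors = [(48.0, -60.0, 40.0), (45.0, 5.0, -60.0)]
success = play_round(toy_speaker, toy_listener, target, distractors, True, rng)
```

In this setup the `speaker_sees_context` flag corresponds to the dashed box in Figure 1: the listener's input is the same in both conditions, and only the speaker's access to the distractors changes.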