Do Language Models' Words Refer?

Matthew Mandelkern, Tal Linzen

TL;DR

The paper investigates whether the words produced by language models (LMs) can refer to external entities, challenging the view that reference requires embodied interaction with the world. It adopts an externalist semantic framework, arguing that the referential success of LM outputs can derive from the natural histories of their textual inputs and from a speaker community’s usage, not from internal beliefs or sensorimotor grounding. Through thought experiments such as Luke’s use of 'Peano' and Putnam’s Twin Earth, the authors illustrate how reference can be determined by causal-historical usage rather than speaker intention. The work highlights implications for AI and NLP, suggesting that LMs may ground reference through linguistic history and deference, while remaining open to alternative accounts and further research.

Abstract

What do language models (LMs) do with language? Everyone agrees that they can produce sequences of (mostly) coherent strings of English. But do those sentences mean something, or are LMs simply babbling in a convincing simulacrum of language use? Here we will address one aspect of this broad question: whether LMs' words can refer, that is, achieve "word-to-world" connections. There is prima facie reason to think they do not since LMs do not interact with the world in the way that ordinary language users do. Drawing on insights from the externalist tradition in philosophy of language, we argue that those appearances are misleading: even if the inputs to an LM are simply strings of text, they are strings of text with natural histories, and that may suffice to put LMs' words into referential contact with the external world.
