Mathematical Foundations for a Compositional Distributional Model of Meaning

Bob Coecke, Mehrnoosh Sadrzadeh, Stephen Clark

TL;DR

This work tackles the problem of integrating distributional word meanings with a compositional grammar by framing meaning in a category-theoretic setting. It introduces a unified model in the product category $\mathbf{FVect} \times P$, where word meanings are vectors and grammatical types control sentence composition via a compact closed structure, producing sentence meanings in a single space $S$. The authors demonstrate concrete computations for positive and negative transitive sentences and show how restricting to the Boolean semiring yields a Montague-style semantics, while also discussing semiring and relational variants. The diagrammatic calculus clarifies information flow in sentence composition and the framework sets a path toward practical implementation and deeper connections with Montague semantics and connectionist approaches.

Abstract

We propose a mathematical framework for a unification of the distributional theory of meaning in terms of vector space models, and a compositional theory for grammatical types, for which we rely on the algebra of Pregroups, introduced by Lambek. This mathematical framework enables us to compute the meaning of a well-typed sentence from the meanings of its constituents. Concretely, the type reductions of Pregroups are 'lifted' to morphisms in a category, a procedure that transforms meanings of constituents into a meaning of the (well-typed) whole. Importantly, meanings of whole sentences live in a single space, independent of the grammatical structure of the sentence. Hence the inner product can be used to compare meanings of arbitrary sentences, as it is for comparing the meanings of words in the distributional model. The mathematical structure we employ admits a purely diagrammatic calculus which exposes how the information flows between the words in a sentence in order to make up the meaning of the whole sentence. A variation of our 'categorical model' which involves constraining the scalars of the vector spaces to the semiring of Booleans results in a Montague-style Boolean-valued semantics.
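
The composition step described above can be illustrated with a small numerical sketch. It assumes a transitive verb is stored as a tensor in $N \otimes S \otimes N$ and that the pregroup reduction $n \, (n^r s n^l) \, n \to s$ is lifted to inner products on the two noun wires, leaving a vector in the sentence space $S$. The two-dimensional spaces, the toy vectors for "alice", "bob" and "likes", and the function names `transitive_sentence` and `cosine` are illustrative assumptions, not values or code from the paper.

```python
import numpy as np


def transitive_sentence(subject, verb, obj):
    """Contract the subject and object vectors with the verb tensor.

    verb has shape (dim N, dim S, dim N); the result lives in S.
    """
    return np.einsum("i,isk,k->s", subject, verb, obj)


def cosine(u, v):
    """Normalised inner product, used to compare two sentence vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))


# Hypothetical noun space N with two basis vectors.
alice = np.array([1.0, 0.0])
bob = np.array([0.0, 1.0])

# Hypothetical verb tensor in N (x) S (x) N with a two-dimensional S.
likes = np.zeros((2, 2, 2))
likes[0, 0, 1] = 1.0  # weight for (alice, first S basis vector, bob)
likes[1, 1, 0] = 1.0  # weight for (bob, second S basis vector, alice)

alice_likes_bob = transitive_sentence(alice, likes, bob)
bob_likes_alice = transitive_sentence(bob, likes, alice)

# Both results are vectors in the same space S, so they can be compared.
similarity = cosine(alice_likes_bob, bob_likes_alice)
print(alice_likes_bob, bob_likes_alice, similarity)
```

Because both sentences reduce to vectors in the same space $S$, the same inner product used for word similarity applies to them unchanged, regardless of how each sentence was built up grammatically.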

Paper Structure

This paper contains 25 sections and 70 equations.

Theorems & Definitions (3)

  • Definition 3.1
  • Definition 3.2
  • Definition 5.1