Density Matrices for Metaphor Understanding

Jay Owers, Ekaterina Shutova, Martha Lewis

TL;DR

This work models metaphor as a form of lexical ambiguity using density matrices within a categorical compositional framework, and tests whether mixtures of word senses can capture metaphorical meaning. It combines the CPM construction with DisCo to produce sentence meanings as density operators, learns word density matrices from text via Multi-sense Word2DM, and applies composition operators such as Add, Mult, Fuzz, and Phaser. Using a newly created metaphor-disambiguation dataset, the study finds that metaphor is notably hard to model and that neural sentence encoders underperform, while some density-matrix methods yield modest improvements over baselines; the best results come from a 10-sense Word2DM with Mult composition. The results show how context and operator choice influence disambiguation, and point to future work on expanding model coverage, testing modern language models, and linking ambiguity with hyponymy.
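The four composition operators named above are not defined in this summary. The sketch below assumes the definitions commonly used in the density-matrix composition literature: Add is the matrix sum, Mult is the elementwise (Hadamard) product, Fuzz is $\sum_i \lambda_i P_i \rho_b P_i$ for the spectral decomposition $\rho_a = \sum_i \lambda_i P_i$, and Phaser is $\rho_a^{1/2} \rho_b \rho_a^{1/2}$. The function names and the toy matrices are illustrative and are not taken from the paper.

```python
import numpy as np

def add(a, b):
    """Add: plain matrix sum (assumed definition)."""
    return a + b

def mult(a, b):
    """Mult: elementwise (Hadamard) product (assumed definition)."""
    return a * b

def fuzz(a, b):
    """Fuzz: sum_i lam_i * P_i b P_i, with a = sum_i lam_i * P_i.

    Uses rank-1 projectors from eigh, so it assumes the eigenvalues
    of `a` are non-degenerate.
    """
    evals, evecs = np.linalg.eigh(a)
    out = np.zeros_like(b, dtype=float)
    for lam, v in zip(evals, evecs.T):
        proj = np.outer(v, v)          # projector onto one eigenvector
        out += lam * proj @ b @ proj
    return out

def phaser(a, b):
    """Phaser: sqrt(a) b sqrt(a), with sqrt taken on the spectrum of a."""
    evals, evecs = np.linalg.eigh(a)
    root = (evecs * np.sqrt(np.clip(evals, 0.0, None))) @ evecs.T
    return root @ b @ root

# Toy 2x2 example: `verb` leans towards the first basis sense,
# `noun` is a pure state spread evenly over both basis directions.
verb = np.diag([0.75, 0.25])
noun = np.full((2, 2), 0.5)

fuzz_out = fuzz(verb, noun)
phaser_out = phaser(verb, noun)
mult_out = mult(verb, noun)
add_out = add(verb, noun)
```

None of these outputs has unit trace in general, so a similarity measure applied afterwards would typically renormalise by dividing by the trace.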

Abstract

In physics, density matrices are used to represent mixed states, i.e. probabilistic mixtures of pure states. This concept has previously been used to model lexical ambiguity. In this paper, we consider metaphor as a type of lexical ambiguity, and examine whether metaphorical meaning can be effectively modelled using mixtures of word senses. We find that modelling metaphor is significantly more difficult than other kinds of lexical ambiguity, but that our best-performing density matrix method outperforms simple baselines as well as some neural language models.
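As a minimal illustration of the idea that a density matrix is a probabilistic mixture of pure states, the sketch below builds $\rho = \sum_i p_i \, |v_i\rangle\langle v_i|$ from hand-picked sense vectors. The vectors, weights, and function names are hypothetical; the paper learns its matrices from text rather than constructing them this way.

```python
import numpy as np

def density_from_senses(sense_vectors, weights):
    """Return rho = sum_i p_i |v_i><v_i| for unit-normalised sense vectors."""
    vecs = np.asarray(sense_vectors, dtype=float)
    probs = np.asarray(weights, dtype=float)
    probs = probs / probs.sum()                               # make weights a distribution
    vecs = vecs / np.linalg.norm(vecs, axis=1, keepdims=True)  # unit-length pure states
    return np.einsum("i,ij,ik->jk", probs, vecs, vecs)

def purity(rho):
    """Tr(rho^2): equals 1 for a pure state, less than 1 for a proper mixture."""
    return float(np.trace(rho @ rho))

# Hypothetical two-sense word: a literal and a figurative sense.
literal = [1.0, 0.0]
figurative = [0.0, 1.0]

rho_mixed = density_from_senses([literal, figurative], [0.5, 0.5])
rho_pure = density_from_senses([literal], [1.0])
```

Here `rho_mixed` has unit trace and purity 0.5, reflecting maximal ambiguity between two orthogonal senses, while `rho_pure` has purity 1.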

Paper Structure (27 sections, 1 theorem, 25 equations, 3 figures, 6 tables)

Key Result

Theorem 1

$\mathbf{CPM}(\mathcal{C})$ is also a $\dagger$-compact closed category. There is a functor $\mathcal{C} \to \mathbf{CPM}(\mathcal{C})$ that preserves the $\dagger$-compact closed structure and is faithful "up to a global phase".

Figures (3)

  • Figure 1: Participants are asked to choose which of paraphrase 1 or paraphrase 2 is better.
  • Figure 2: Participants are asked to rate the similarity of each paraphrase to the target sentence.
  • Figure 3: Analysis of which sentences were scored correctly by ms-Word2DM-d10. Note that verb operator models scored a lot of sentences the same as the verb, meaning composition had little effect.

Theorems & Definitions (5)

  • Definition 1: Completely positive morphism (Selinger)
  • Definition 2: $\mathbf{CPM}(\mathcal{C})$ (Selinger)
  • Theorem 1: Compact Closure (Selinger)
  • Example 1
  • Example 2