A Quantum-Inspired Analysis of Human Disambiguation Processes

Daphne Wang

TL;DR

This work develops a quantum-inspired framework to model human disambiguation in English, using category-theoretic tools (sheaves, presheaves, and monoidal categories) and contextuality formalisms to connect lexical/syntactic ambiguities with quantum-like statistics. By analyzing corpus and human-judgment data through Contextuality-by-Default and the sheaf-theoretic approach, the study uncovers quantum-like contextuality in lexical phrases and notable causal structure in verb-driven disambiguation. It also demonstrates that quantum simulations and variational circuits can approximate human disambiguation patterns, offering a pathway to quantum-native NLP methods and potential advantages in certain linguistic tasks. The findings illuminate how contextuality and causality frameworks capture human parsing dynamics and reading-time effects, suggesting new directions for NLP models that leverage quantum-inspired representations and causally structured data.

Abstract

Formal languages are essential for computer programming and are constructed to be easily processed by computers. In contrast, natural languages are much more challenging and instigated the field of Natural Language Processing (NLP). One major obstacle is the ubiquity of ambiguities. Recent advances in NLP have led to the development of large language models, which can resolve ambiguities with high accuracy. At the same time, quantum computers have gained much attention in recent years as they can solve some computational problems faster than classical computers. This new computing paradigm has reached the fields of machine learning and NLP, where hybrid classical-quantum learning algorithms have emerged. However, more research is needed to identify which NLP tasks could benefit from a genuine quantum advantage. In this thesis, we applied formalisms arising from foundational quantum mechanics, such as contextuality and causality, to study ambiguities arising from linguistics. By doing so, we also reproduced psycholinguistic results relating to the human disambiguation process. These results were subsequently used to predict human behaviour and outperformed current NLP methods.

Paper Structure (180 sections, 11 theorems, 301 equations, 43 figures, 9 tables)

This paper contains 180 sections, 11 theorems, 301 equations, 43 figures, 9 tables.

Key Result

Proposition 1.44

For a cyclic system with binary random variables taking values in $\{\pm1\}$, we have [equation not preserved in the source], where $\Delta^*_{c_q,c'_q}$ is the minimum direct influence of the contexts $c_q,c'_q$ associated with content $q$ across all canonical models compatible with the observed distributions.
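
The formula of Proposition 1.44 itself is not available here, so the following is background only and not a reconstruction of it. As a minimal sketch, assuming the standard Contextuality-by-Default criterion for cyclic systems of binary $\pm1$ variables from the literature (Kujala and Dzhafarov), a rank-$n$ cyclic system is noncontextual exactly when $s_{\mathrm{odd}}$ of the within-context correlations is at most $n - 2 + \Delta$, where $\Delta$ sums the absolute differences between the expectations of the same content measured in its two contexts. The function names `s_odd` and `cyclic_contextuality` and the input layout are hypothetical, chosen for illustration.

```python
from itertools import product


def s_odd(xs):
    """Maximum of sum(+/- x_i) over sign patterns with an odd number of minus signs."""
    best = float("-inf")
    for signs in product((1, -1), repeat=len(xs)):
        if signs.count(-1) % 2 == 1:
            best = max(best, sum(s * x for s, x in zip(signs, xs)))
    return best


def cyclic_contextuality(correlations, marginal_pairs):
    """Return s_odd(correlations) - (n - 2) - Delta for a rank-n cyclic system.

    correlations   : the n within-context product expectations, each in [-1, 1]
    marginal_pairs : n pairs (a, b), the expectations of one content in its two contexts
    A strictly positive result means the system is contextual under the assumed
    criterion; zero or negative means noncontextual.
    """
    n = len(correlations)
    delta = sum(abs(a - b) for a, b in marginal_pairs)
    return s_odd(correlations) - (n - 2) - delta


# Rank-4 example with PR-box-like correlations and identical marginals (Delta = 0).
pr_box = cyclic_contextuality([1, 1, 1, -1], [(0, 0)] * 4)
# Rank-4 example with weak, uniform correlations and identical marginals.
weak = cyclic_contextuality([0.5, 0.5, 0.5, 0.5], [(0, 0)] * 4)
```

Under this criterion, `pr_box` is positive (contextual) and `weak` is negative (noncontextual). The brute-force search over sign patterns is exponential in $n$, which is acceptable for the small cyclic systems typically considered.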

Figures (43)

  • Figure 1: Illustration of the restriction morphisms of a presheaf.
  • Figure 2: Illustration of the general presheaf structure over intersecting sets. If there exists a gluing between two sections in $PU$ and $PV$, then they will coincide in the two dashed regions $\left.PU\right|_{U\cap V}$ and $\left.PV\right|_{U\cap V}$.
  • Figure 3: Causal diagram of a Bell experiment. Events are represented as dots and future light-cones are represented as triangles.
  • Figure 4: Bayesian Network representation of a canonical causal model.
  • Figure 5: Correspondence between the original measurement scenario (left) and the consistentified one (right). On the latter, the solid measurement contexts are the ones inherited from the left-hand measurement scenario, whilst the dashed ones are the ones created from the minimal direct influence condition.
  • ...and 38 more figures

Theorems & Definitions (100)

  • Definition 1.1: Category
  • Example 1.2
  • Definition 1.3
  • Definition 1.4
  • Example 1.5
  • Definition 1.6
  • Example 1.7
  • Definition 1.8
  • Definition 1.9: Functor category
  • Definition 1.10
  • ...and 90 more