Sphere Neural-Networks for Rational Reasoning

Tiansi Dong, Mateja Jamnik, Pietro Liò

TL;DR

Sphere Neural Networks (SphNN) introduce a deterministic, neuro-symbolic framework that represents concepts as spheres and set-theoretic relations via boundary interactions, enabling one-epoch satisfiability checks for long-chained syllogisms with worst-case $O(N)$ complexity. By embedding syllogistic statements as spatial relations and using a Kolmogorov-Arnold-style neural transition map within a three-layer GPS-like architecture, SphNN performs model construction and inspection to verify validity, bridging vector semantics with explicit set-theoretic boundaries. The approach demonstrates deterministic neural syllogistic reasoning, supports neuro-symbolic unification with latent vector embeddings, and extends toward spatio-temporal, probabilistic, and even humour reasoning, offering a potential neural implementation of Herbert A. Simon’s scissors with two neural blades. Empirical results show SphNN can construct counter-models for invalid deductions and, via interaction with LLMs like ChatGPT, can improve reasoning reliability, albeit with higher computational cost. Overall, SphNN provides a principled path to deterministic, explainable neural reasoning that complements and constrains large language models, grounding high-level cognition in geometric, boundary-based representations with scalable theoretical guarantees.

Abstract

The success of Large Language Models (LLMs), e.g., ChatGPT, is witnessed by their planetary popularity, their capability of human-like communication, and also by their steadily improved reasoning performance. However, it remains unclear whether LLMs reason. It is an open problem how traditional neural networks can be qualitatively extended to go beyond the statistical paradigm and achieve high-level cognition. Here, we present a novel qualitative extension by generalising computational building blocks from vectors to spheres. We propose Sphere Neural Networks (SphNNs) for human-like reasoning through model construction and inspection, and develop SphNN for syllogistic reasoning, a microcosm of human rationality. SphNN is a hierarchical neuro-symbolic Kolmogorov-Arnold geometric GNN, and uses a neuro-symbolic transition map of neighbourhood spatial relations to transform the current sphere configuration towards the target. SphNN is the first neural model that can determine the validity of long-chained syllogistic reasoning in one epoch without training data, with a worst-case computational complexity of $O(N)$. SphNN can evolve into various types of reasoning, such as spatio-temporal reasoning, logical reasoning with negation and disjunction, event reasoning, neuro-symbolic unification, and humour understanding (the highest level of cognition). All these suggest a new kind of Herbert A. Simon's scissors with two neural blades. SphNNs will tremendously enhance interdisciplinary collaborations to develop the two neural blades, realise deterministic neural reasoning and human-bounded rationality, and elevate LLMs to reliable psychological AI. This work suggests that the non-zero radii of spheres are the missing components that prevent traditional deep-learning systems from reaching the realm of rational reasoning and cause LLMs to be trapped in the swamp of hallucination.

Paper Structure

This paper contains 78 sections, 24 theorems, 16 equations, 30 figures, 12 tables, 4 algorithms.

Key Result

Corollary 1

Each $\Delta$ function is linear with respect to the radius and monotonic with respect to the distance between the centres.
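
As an illustration only, take the inside-relation function given in the caption of Figure 5, writing $d = \|\vec{O}_1-\vec{O}_2\|$ (the shorthand $d$ is introduced here, not taken from the paper). The corollary covers every $\Delta$ function; this sketch checks just that one, and reads "linear" as holding on the region where the $\max$ is not clipped to zero, which is an assumption about the intended meaning:

$$
\Delta(\mathcal{O}_1,\mathcal{O}_2) = \max(0,\; d + r_1 - r_2), \qquad
\text{for } d + r_1 - r_2 > 0:\quad
\frac{\partial \Delta}{\partial r_1} = 1,\quad
\frac{\partial \Delta}{\partial r_2} = -1,\quad
\frac{\partial \Delta}{\partial d} = 1 .
$$

On that region $\Delta$ is affine in each radius, and since $\max(0,\cdot)$ is non-decreasing, $\Delta$ never decreases as $d$ grows.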

Figures (30)

  • Figure 1: (a) The geographical location of San Diego and Reno; (b) the region-based mental spatial representation explains why people mistakenly judge the spatial relation between San Diego and Reno; (c) two-step syllogistic reasoning to judge the relation between San Diego and Reno.
  • Figure 2: (a) a vector; (b) a closed umbrella; (c) an arc with its centre vector; (d) an open umbrella.
  • Figure 3: (a) the input of a traditional perceptron is a vector $\vec{x}=[x_1\dots x_n]$; (b) the input of a diameter-limited perceptron is restricted inside a sphere with the centre $\vec{O}$ and the radius $r$.
  • Figure 4: The inputs of the neural network are two spheres, $o_{11}\dots o_{1n}, r_1$ and $o_{21}\dots o_{2n}, r_2$, each represented by its centre and its radius. The network computes the distance between their centres, $dis = \sqrt{\sum_{i=1}^n (o_{1i}-o_{2i})^2}$. The output of the network is the value of $\max\{0, dis + r_1 - r_2\}$, which equals 0 when $\mathcal{O}_1$ is inside $\mathcal{O}_2$ and is greater than 0 otherwise.
  • Figure 5: The Kolmogorov-Arnold neural architecture of $\Delta(\mathcal{O}_1,\mathcal{O}_2)\triangleq \max(0, \|\vec{O}_1-\vec{O}_2\| + r_1 - r_2)$. Given two $(n+1)$-dimensional vectors $\vec{x}$ and $\vec{y}$ representing $n$-dimensional spheres, $f_r(\cdot)$ selects the $(n+1)^{\text{th}}$ elements $x_{n+1}$ and $y_{n+1}$ and returns the radii $e^{x_{n+1}}$ and $e^{y_{n+1}}$ of $\mathcal{O}_1$ and $\mathcal{O}_2$, respectively; $f_{\vec{O}}(\cdot)$ selects the first $n$ elements as the centre of a sphere; $f_{\|\_\|}(\cdot)$ computes the Euclidean norm of a vector; the output of the first hidden layer is $\vec{O}_1-\vec{O}_2$; the output of the second hidden layer is $\|\vec{O}_1-\vec{O}_2\| + r_1 - r_2$; the final output of this network is zero if $\mathcal{O}_1$ is inside $\mathcal{O}_2$, and greater than zero otherwise.
  • ...and 25 more figures
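
The computation described in the captions of Figures 4 and 5 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `sphere_from_vector` and `delta_inside` are hypothetical, and the encoding (first $n$ elements as the centre, last element as the log-radius) follows the Figure 5 caption.

```python
import math


def sphere_from_vector(x):
    """Split an (n+1)-element vector into (centre, radius).

    Assumed encoding, following the Figure 5 caption: the first n elements
    are the centre and the radius is exp(last element), so it is always > 0.
    """
    *centre, log_radius = x
    return centre, math.exp(log_radius)


def delta_inside(x, y):
    """Return max(0, ||O1 - O2|| + r1 - r2) for spheres encoded by x and y.

    The value is 0 when sphere 1 lies inside sphere 2, and positive otherwise.
    """
    centre1, r1 = sphere_from_vector(x)
    centre2, r2 = sphere_from_vector(y)
    dis = math.sqrt(sum((a - b) ** 2 for a, b in zip(centre1, centre2)))
    return max(0.0, dis + r1 - r2)


# Hypothetical 2-D example: a unit sphere near the centre of a radius-3 sphere.
small = [0.0, 0.0, math.log(1.0)]
large = [0.5, 0.0, math.log(3.0)]

print(delta_inside(small, large))  # zero: small is inside large
print(delta_inside(large, small))  # positive (about 2.5): large is not inside small
```

Storing the log-radius rather than the radius itself keeps every radius strictly positive without an explicit constraint, which is presumably why the caption applies $e^{x_{n+1}}$.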

Theorems & Definitions (32)

  • Remark 1
  • Corollary 1
  • Theorem 1
  • Corollary 2
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5
  • Corollary 1
  • Theorem 1
  • ...and 22 more