Table of Contents
Fetching ...

Artifical intelligence and inherent mathematical difficulty

Walter Dean, Alberto Naibo

TL;DR

The paper investigates whether artificial intelligence can alleviate the fundamental challenge of proof discovery in mathematics. It formalizes proof discovery as the decision problem Prov^S_Γ and shows that, due to Gödel–incompleteness and arithmetical hierarchy results, this task remains inherently difficult and often intractable in practice. Through case studies using automated theorem proving, SAT solvers, and large language models, it demonstrates that current AI methods operate as brute-force search on low-complexity (Σ^0_1) statements, yielding successes that are impressive but not representative of resolving high-profile open questions. The findings suggest AI will contribute incrementally and selectively, while the core difficulty of resolving open questions persists, prompting reflection on natural-case complexity and human–computer collaboration in mathematical practice.

Abstract

This paper explores the relationship of artificial intelligence to the task of resolving open questions in mathematics. We first present an updated version of a traditional argument that limitative results from computability and complexity theory show that proof discovery is an inherently difficult problem. We then illustrate how several recent applications of artificial intelligence-inspired methods -- respectively involving automated theorem proving, SAT-solvers, and large language models -- do indeed raise novel questions about the nature of mathematical proof. We also argue that the results obtained by such techniques do not tell against our basic argument. This is so because they are embodiments of brute force search and are thus capable of deciding only statements of low logical complexity.

Artifical intelligence and inherent mathematical difficulty

TL;DR

The paper investigates whether artificial intelligence can alleviate the fundamental challenge of proof discovery in mathematics. It formalizes proof discovery as the decision problem Prov^S_Γ and shows that, due to Gödel–incompleteness and arithmetical hierarchy results, this task remains inherently difficult and often intractable in practice. Through case studies using automated theorem proving, SAT solvers, and large language models, it demonstrates that current AI methods operate as brute-force search on low-complexity (Σ^0_1) statements, yielding successes that are impressive but not representative of resolving high-profile open questions. The findings suggest AI will contribute incrementally and selectively, while the core difficulty of resolving open questions persists, prompting reflection on natural-case complexity and human–computer collaboration in mathematical practice.

Abstract

This paper explores the relationship of artificial intelligence to the task of resolving open questions in mathematics. We first present an updated version of a traditional argument that limitative results from computability and complexity theory show that proof discovery is an inherently difficult problem. We then illustrate how several recent applications of artificial intelligence-inspired methods -- respectively involving automated theorem proving, SAT-solvers, and large language models -- do indeed raise novel questions about the nature of mathematical proof. We also argue that the results obtained by such techniques do not tell against our basic argument. This is so because they are embodiments of brute force search and are thus capable of deciding only statements of low logical complexity.
Paper Structure (14 sections, 3 theorems)

This paper contains 14 sections, 3 theorems.

Key Result

Theorem 1

Suppose that a)$\vdash_S$ is a computably enumerable derivability relation which extends that of classical first-order logic $\mathrm{[FOL]}$ and b)$\Gamma$ is a consistent, computably axiomatizable theory that interprets $\mathsf{Q}$$($i.e. Robinson arithmetic$)$ and which is additionally $\Sigma^0

Theorems & Definitions (3)

  • Theorem 1
  • Theorem 2
  • Theorem 3