Learning to Compute Gröbner Bases

Hiroshi Kera; Yuki Ishihara; Yuta Kambe; Tristan Vaccon; Kazuhiro Yokoyama

TL;DR

Experiments show that the proposed dataset generation method is a few orders of magnitude faster than a naive approach, overcoming a crucial challenge in learning to compute Gröbner bases, and that Gröbner basis computation is learnable in a particular class.

Abstract

Solving a polynomial system, or computing an associated Gröbner basis, has been a fundamental task in computational algebra. However, it is also known for its notorious doubly exponential time complexity in the number of variables in the worst case. This paper is the first to address the learning of Gröbner basis computation with Transformers. The training requires many pairs of a polynomial system and the associated Gröbner basis, raising two novel algebraic problems: the random generation of Gröbner bases and their transformation into non-Gröbner ones, termed the backward Gröbner problem. We resolve these problems for 0-dimensional radical ideals, which appear in various applications. Further, we propose a hybrid input embedding to handle coefficient tokens with continuity bias and avoid the growth of the vocabulary set. The experiments show that our dataset generation method is a few orders of magnitude faster than a naive approach, overcoming a crucial challenge in learning to compute Gröbner bases, and that Gröbner basis computation is learnable in a particular class.
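
The first of the two problems, random generation of Gröbner bases, can be illustrated with a minimal sketch. It assumes the standard shape-position form under the lexicographic order $x_1 > \cdots > x_n$, namely $\{x_1 - g_1(x_n), \ldots, x_{n-1} - g_{n-1}(x_n), h(x_n)\}$ with $h$ monic and $\deg g_i < \deg h$. The function name `sample_shape_position_basis`, the coefficient bound, and the use of SymPy over the rationals are assumptions made for illustration and are not taken from the paper. The sketch does not check that $h$ is squarefree, which would additionally be needed for the ideal to be radical.

```python
import random
from sympy import groebner, symbols


def sample_shape_position_basis(gens, degree, coeff_bound=5, seed=0):
    """Sample a basis in shape position w.r.t. lex order gens[0] > ... > gens[-1].

    Hypothetical helper: the sampling distribution is an assumption.
    """
    rng = random.Random(seed)
    *leading_vars, last = gens

    def random_tail(max_degree):
        # Random univariate polynomial in `last` of degree <= max_degree.
        return sum(rng.randint(-coeff_bound, coeff_bound) * last**i
                   for i in range(max_degree + 1))

    # h is monic of the requested degree; each g_i has strictly smaller degree.
    h = last**degree + random_tail(degree - 1)
    return [v - random_tail(degree - 1) for v in leading_vars] + [h]


gens = symbols("x1 x2 x3")
basis = sample_shape_position_basis(gens, degree=4)

# Sanity check: the sampled set is already a Groebner basis, so SymPy's
# reduced Groebner basis generates the same ideal with the same number of elements.
gb = groebner(basis, *gens, order="lex")
```

Because the leading terms $x_1, \ldots, x_{n-1}, x_n^{d}$ are pairwise coprime, no Gröbner basis algorithm has to be run to produce such a sample; the call to `groebner` here serves only as a check.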

Paper Structure (46 sections, 7 theorems, 12 equations, 1 figure, 25 tables, 1 algorithm)

Introduction
Related Work
Gröbner basis computation.
Transformers for mathematics.
Notations and Definitions
Intuition of Gröbner bases and system solving.
Other notations.
New Algebraic Problems
Scope of this study
Random generation of Gröbner bases
Backward Gröbner problem
Dataset generation algorithm
Hybrid Input Embedding
Experiments
Dataset generation
...and 31 more sections

Key Result

Theorem 4.6

Let $G = (g_1,\ldots, g_t)^{\top}$ be a Gröbner basis of a 0-dimensional ideal in $k[x_1,\ldots, x_n]$. Let $F = (f_1,\ldots, f_s)^{\top} = AG$ with $A \in k[x_1,\ldots, x_n]^{s\times t}$.
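
The setup $F = AG$ can be made concrete with a small SymPy sketch. It picks a $2 \times 2$ matrix $A$ that is a product of an upper and a lower unitriangular polynomial matrix, so $\det A = 1$ and $A$ is invertible over the polynomial ring, which guarantees $\langle F \rangle = \langle G \rangle$. This unimodular choice is an assumption made here for illustration; it is one special case and not necessarily the class of matrices the theorem covers. The particular polynomials `h` and `g` are also illustrative.

```python
from sympy import Matrix, expand, groebner, symbols

x, y = symbols("x y")

# A Groebner basis in shape position w.r.t. lex order x > y (illustrative choice).
h = y**3 - 2*y + 1
g = y**2 + 3
G = Matrix([h, x - g])

# Product of unitriangular matrices: determinant 1, hence invertible
# over the polynomial ring, so F and G generate the same ideal.
U_upper = Matrix([[1, y], [0, 1]])
U_lower = Matrix([[1, 0], [x, 1]])
A = U_upper * U_lower
assert expand(A.det()) == 1

# The "backward" step: turn the Groebner basis G into a generating set F.
F = (A * G).applyfunc(expand)

# The "forward" step a learner would be trained on: recover the basis from F.
gb_F = groebner(list(F), x, y, order="lex")
gb_G = groebner(list(G), x, y, order="lex")
```

In this example the leading terms of the two entries of `F` are $x y^4$ and $x y^3$, while the ideal contains a polynomial with leading term $x$, so `F` is not itself a Gröbner basis even though it generates the same ideal as `G`.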

Figures (1)

Figure 1: Visual analysis of embedding vectors of numbers given by the proposed embedding. Embedding $c\in {\mathbb{R}}$ to $f_{\mathrm{E}}(c) \in {\mathbb{R}}^{D}$ from $c_{\min}$ to $c_{\max}$ with $B$ bins to obtain $M \in {\mathbb{R}}^{B\times D}$, the six panels show, from the left, (i) the Euclidean distance matrix of $M$, (ii) its slice at $0$, (iii) the norm of embedding vectors, (iv) the dot product $\tilde{M}\tilde{M}^{\top}$ with $\tilde{M}$ of the row-normalized $M$, (v) $f_{\mathrm{E}}(0)^{\top}\tilde{M}$ and (vi) $f_{\mathrm{E}}(c_0)^{\top}M$. (a) Trained on ${\mathbb{R}}[x_1, x_2]$; $(c_{\min}, c_{\max}) = (-100, 100)$. (b) Trained on ${\mathbb{F}}_{31}[x_1, x_2]$; $(c_{\min}, c_{\max}) = (0, 31)$. The embedding layer $f_{\mathrm{E}}$ has one/two hidden layers (top/bottom rows). As can be seen, the relationship between embedding vectors in terms of distance and dot product is aligned well in the infinite field and not in the finite field.
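
The quantities in panels (i) to (v) can be computed as in the following NumPy sketch. The function `embed` is a hypothetical stand-in for $f_{\mathrm{E}}$ with one hidden layer and random, untrained weights, and the sizes `B`, `H`, `D` and the input scaling are assumptions, so the output says nothing about the trained embeddings in the figure; the sketch only shows how the diagnostics are formed from $M$.

```python
import numpy as np


def embed(c, W1, b1, W2, b2):
    """Hypothetical stand-in for f_E: maps B scalars to B vectors in R^D."""
    hidden = np.tanh(c[:, None] * W1 + b1)  # shape (B, H)
    return hidden @ W2 + b2                 # shape (B, D)


rng = np.random.default_rng(0)
B, H, D = 201, 16, 8                        # bins, hidden width, embedding dim
W1, b1 = rng.normal(size=(1, H)), rng.normal(size=H)
W2, b2 = rng.normal(size=(H, D)), rng.normal(size=D)

c_min, c_max = -100.0, 100.0
c = np.linspace(c_min, c_max, B)            # B evenly spaced coefficient values
M = embed(c / c_max, W1, b1, W2, b2)        # inputs scaled to [-1, 1] (assumption)

# (i) pairwise Euclidean distances between embedding vectors
dist = np.linalg.norm(M[:, None, :] - M[None, :, :], axis=-1)
# (ii) the row of the distance matrix belonging to c = 0
i0 = int(np.argmin(np.abs(c)))
dist_at_zero = dist[i0]
# (iii) norms of the embedding vectors
norms = np.linalg.norm(M, axis=1)
# (iv) dot products of the row-normalized embeddings
M_tilde = M / norms[:, None]
cos = M_tilde @ M_tilde.T
# (v) dot products of f_E(0) with every row-normalized embedding
zero_vs_all = M_tilde @ M[i0]
```

For an embedding with continuity bias, one would expect `dist_at_zero` to grow smoothly with $|c|$; whether that holds depends on the trained weights, which this sketch does not have.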

Theorems & Definitions (30)

Definition 3.1: Leading term
Definition 3.2: Gröbner basis
Definition 4.3: 0-dimensional ideal
Definition 4.4: Shape position
Theorem 4.6
Proposition 4.6
Theorem 4.7
Definition A.1: Ring, Field (atiyah1994introduction, Chap. 1 §1)
Definition A.2: Polynomial Ring (atiyah1994introduction, Chap. 1 §1)
Definition A.3: Quotient Ring (atiyah1994introduction, Chap. 1 §1)
...and 20 more
