Learning advanced mathematical computations from examples

François Charton, Amaury Hayat, Guillaume Lample

TL;DR

The paper demonstrates that transformers trained on large synthetic datasets can predict both qualitative and numerical properties of differential systems without built-in mathematical knowledge. Targeting local stability, controllability, and PDE behavior, it reports near-perfect qualitative accuracy and strong numerical performance, often surpassing simple baselines and generalizing to longer expressions and new problem distributions. The work highlights the potential of neural sequence models to learn symbolic-numeric computation, while acknowledging that the learned solutions may rely on shortcuts rather than explicit mathematical reasoning. These findings point toward fast, parallelizable alternatives to traditional solvers and motivate further work on interpretability and on the limits of what such models can learn.

Abstract

Using transformers over large generated datasets, we train models to learn mathematical properties of differential systems, such as local stability, behavior at infinity and controllability. We achieve near perfect prediction of qualitative characteristics, and good approximations of numerical features of the system. This demonstrates that neural networks can learn to perform complex computations, grounded in advanced theory, from examples, without built-in mathematical knowledge.

Paper Structure

This paper contains 45 sections, 3 theorems, 39 equations, 1 figure, 10 tables.

Key Result

Theorem 3.1

Let $J(f)(x_{e})$ be the Jacobian matrix of $f$ in $x_{e}$ (the matrix of its partial derivatives relative to its variables). Let $\lambda$ be the largest real part of its eigenvalues. If $\lambda$ is positive, $x_{e}$ is an unstable equilibrium. If $\lambda$ is negative, then $x_{e}$ is a locally s
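The spectral criterion in Theorem 3.1 can be checked numerically. The sketch below is not the paper's method (the paper trains a transformer to predict the result); it is a direct computation using NumPy, with the Jacobian approximated by central finite differences. The names `jacobian`, `classify_equilibrium`, `pendulum`, the step size `eps`, and the tolerance `tol` are all illustrative assumptions. The "inconclusive" branch reflects the standard fact that linearization does not settle the case where the largest real part is zero.

```python
import numpy as np

def jacobian(f, x_e, eps=1e-6):
    """Central finite-difference approximation of the Jacobian of f at x_e.

    f maps an array of shape (n,) to an array of shape (n,).
    eps is an assumed step size, not a value from the paper.
    """
    x_e = np.asarray(x_e, dtype=float)
    n = x_e.size
    J = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        # Column j holds the partial derivatives with respect to x_j.
        J[:, j] = (np.asarray(f(x_e + step)) - np.asarray(f(x_e - step))) / (2 * eps)
    return J

def classify_equilibrium(f, x_e, tol=1e-8):
    """Return (label, lam), where lam is the largest real part of the
    eigenvalues of the Jacobian of f at x_e."""
    lam = float(np.linalg.eigvals(jacobian(f, x_e)).real.max())
    if lam > tol:
        return "unstable", lam
    if lam < -tol:
        return "locally stable", lam
    # lam is numerically zero: the linearization alone does not decide.
    return "inconclusive", lam

def pendulum(x):
    """Hypothetical test system: damped pendulum, x = (angle, angular velocity)."""
    return np.array([x[1], -np.sin(x[0]) - 0.5 * x[1]])

if __name__ == "__main__":
    print(classify_equilibrium(pendulum, [0.0, 0.0]))    # hanging position
    print(classify_equilibrium(pendulum, [np.pi, 0.0]))  # inverted position
```

For this example system, the hanging position has a negative largest real part and the inverted position a positive one, so the two equilibria fall on opposite sides of the criterion.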

Figures (1)

  • Figure 1: End to end stability accuracy vs number of training examples. 12 models, trained over shuffled versions of the same dataset.

Theorems & Definitions (8)

  • Theorem 3.1
  • Theorem 3.2: Kalman condition
  • Proposition 3.1
  • Definition B.1
  • Definition B.2
  • Definition B.3
  • Definition B.4
  • Definition B.5
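
Theorem 3.2 appears in the list above by name only. As a hedged illustration, the sketch below implements the standard textbook form of the Kalman rank condition for a linear time-invariant system $\dot{x} = Ax + Bu$ with $A \in \mathbb{R}^{n \times n}$: the system is controllable if and only if the matrix $[B, AB, \dots, A^{n-1}B]$ has rank $n$. The paper's exact statement is not reproduced on this page, so the function names and the example matrices are assumptions made for illustration.

```python
import numpy as np

def controllability_matrix(A, B):
    """Build [B, AB, ..., A^(n-1) B] for an n x n matrix A."""
    A = np.atleast_2d(np.asarray(A, dtype=float))
    n = A.shape[0]
    # Accept B as a 1-D vector or as an n x m matrix.
    B = np.asarray(B, dtype=float).reshape(n, -1)
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_controllable(A, B):
    """Kalman rank test: True when the controllability matrix has full row rank."""
    C = controllability_matrix(A, B)
    return bool(np.linalg.matrix_rank(C) == C.shape[0])

if __name__ == "__main__":
    # Hypothetical example 1: double integrator driven through its second state.
    print(is_controllable([[0, 1], [0, 0]], [[0], [1]]))
    # Hypothetical example 2: decoupled modes, input reaches only the first one.
    print(is_controllable([[1, 0], [0, 2]], [[1], [0]]))
```

The rank is computed numerically with `np.linalg.matrix_rank`, which uses a singular-value threshold, so nearly uncontrollable systems may be classified either way depending on scaling.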