Table of Contents
Fetching ...

AlphaIntegrator: Transformer Action Search for Symbolic Integration Proofs

Mert Ünsal, Timon Gehr, Martin Vechev

TL;DR

This work introduces a symbolic engine with axiomatically correct actions on mathematical expressions, as well as the first dataset for step-by-step integration, and presents the first correct-by-construction learning-based system for step-by-step mathematical integration.

Abstract

We present the first correct-by-construction learning-based system for step-by-step mathematical integration. The key idea is to learn a policy, represented by a GPT transformer model, which guides the search for the right mathematical integration rule, to be carried out by a symbolic solver. Concretely, we introduce a symbolic engine with axiomatically correct actions on mathematical expressions, as well as the first dataset for step-by-step integration. Our GPT-style transformer model, trained on this synthetic data, demonstrates strong generalization by surpassing its own data generator in accuracy and efficiency, using 50% fewer search steps. Our experimental results with SoTA LLMs also demonstrate that the standard approach of fine-tuning LLMs on a set of question-answer pairs is insufficient for solving this mathematical task. This motivates the importance of discovering creative methods for combining LLMs with symbolic reasoning engines, of which our work is an instance.

AlphaIntegrator: Transformer Action Search for Symbolic Integration Proofs

TL;DR

This work introduces a symbolic engine with axiomatically correct actions on mathematical expressions, as well as the first dataset for step-by-step integration, and presents the first correct-by-construction learning-based system for step-by-step mathematical integration.

Abstract

We present the first correct-by-construction learning-based system for step-by-step mathematical integration. The key idea is to learn a policy, represented by a GPT transformer model, which guides the search for the right mathematical integration rule, to be carried out by a symbolic solver. Concretely, we introduce a symbolic engine with axiomatically correct actions on mathematical expressions, as well as the first dataset for step-by-step integration. Our GPT-style transformer model, trained on this synthetic data, demonstrates strong generalization by surpassing its own data generator in accuracy and efficiency, using 50% fewer search steps. Our experimental results with SoTA LLMs also demonstrate that the standard approach of fine-tuning LLMs on a set of question-answer pairs is insufficient for solving this mathematical task. This motivates the importance of discovering creative methods for combining LLMs with symbolic reasoning engines, of which our work is an instance.
Paper Structure (35 sections, 2 equations, 7 figures, 3 tables, 1 algorithm)

This paper contains 35 sections, 2 equations, 7 figures, 3 tables, 1 algorithm.

Figures (7)

  • Figure 1: The symbolic engine takes in the subexpression $\int x(4x + \log x - 2) dx$ and applies the substitution rule with $y = \log x$. The symbolic engine will realize the rule by first differentiating $y(x)$ and dividing the integrand by $y'$. Then, it will substitute by $y$ wherever it observes $y(x)$ and then solve for $x = g(y)$ to substitute the remaining terms.
  • Figure 2: Tree representations of the expressions $\frac{1}{x+3} + 2\cosh^2(x)$ and $\int x^2 e^x dx$
  • Figure 3: Inference Loop for Integration
  • Figure 4: Step-by-step solution generated by AlphaIntegrator.
  • Figure 5: Distribution of number of nodes explored in tree search for AlphaIntegrator vs SymPy.
  • ...and 2 more figures