Generative Pre-Trained Transformer for Symbolic Regression Base In-Context Reinforcement Learning

Yanjie Li, Weijun Li, Lina Yu, Min Wu, Jingyi Liu, Wenqiang Li, Meilan Hao, Shu Wei, Yusong Deng

TL;DR

Symbolic regression remains challenging due to trade-offs between search efficiency, robustness, and generalization. FormulaGPT distills offline RL histories of SR methods into a Transformer, using a SetTransformer encoder to condition on data and autoregressively generate SR training sequences, including constant optimization, with a speed-up via a shortcut dataset. The approach achieves competitive or state-of-the-art fitting on multiple SR benchmarks, while offering improved noise robustness, versatility, and inference efficiency compared to prior pre-trained models. This work essentially bridges RL-based SR and large-scale pre-training, enabling practical, context-aware policy updates for new data and paving the way for robust symbolic discovery in noisy real-world settings.

Abstract

Mathematical formulas are the human language for describing nature and the essence of scientific research. Finding mathematical formulas from observational data is a major demand of scientific research and a major challenge for artificial intelligence. This area is called symbolic regression (SR). Originally, symbolic regression was often formulated as a combinatorial optimization problem and solved using genetic programming (GP) or reinforcement learning algorithms. These two kinds of algorithms have strong noise robustness and good versatility. However, inference usually takes a long time, so search efficiency is relatively low. Later, methods based on large-scale pre-training were proposed; they use a large number of synthetic pairs of data points and expressions to train a Generative Pre-Trained Transformer (GPT). This GPT needs only one forward pass to obtain a result, so inference is very fast. However, its performance depends heavily on the training data, and it performs poorly on data outside the training set, which leads to poor noise robustness and versatility. So, can we combine the advantages of these two categories of SR algorithms? In this paper, we propose FormulaGPT, which trains a GPT using massive sparse-reward learning histories of reinforcement learning-based SR algorithms as training data. After training, the reinforcement learning-based SR algorithm is distilled into a Transformer. When new test data arrives, FormulaGPT can directly generate a "reinforcement learning process" and automatically update the learning policy in context. Tested on more than ten datasets including SRBench, FormulaGPT achieves state-of-the-art fitting performance compared with four baselines. In addition, it achieves satisfactory results in noise robustness, versatility, and inference efficiency.
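
To make the idea of a "learning history" concrete, here is a minimal sketch of how a search trajectory could be flattened into one training sequence. It is not the paper's pipeline. The prefix-notation tokens, the `ARITY` table, the `<R2=...>` reward token, and the helper names are all assumptions made for illustration. The sketch treats "sparse reward" as one reward per completed expression.

```python
import math

# Hypothetical operator vocabulary with arities (an assumption for this sketch).
ARITY = {"add": 2, "mul": 2, "sin": 1, "cos": 1}


def eval_prefix(tokens, x):
    """Evaluate a prefix-notation expression at a scalar input x."""
    def walk(pos):
        tok = tokens[pos]
        if tok == "x":
            return x, pos + 1
        if tok not in ARITY:
            return float(tok), pos + 1  # numeric constant token
        args, pos = [], pos + 1
        for _ in range(ARITY[tok]):
            val, pos = walk(pos)
            args.append(val)
        if tok == "add":
            return args[0] + args[1], pos
        if tok == "mul":
            return args[0] * args[1], pos
        if tok == "sin":
            return math.sin(args[0]), pos
        return math.cos(args[0]), pos

    value, end = walk(0)
    if end != len(tokens):
        raise ValueError("trailing tokens in expression")
    return value


def r2_score(y_true, y_pred):
    """Standard coefficient of determination."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def build_history(candidates, xs, ys):
    """Concatenate each candidate's tokens followed by one reward token."""
    history = []
    for tokens in candidates:
        preds = [eval_prefix(tokens, x) for x in xs]
        reward = round(r2_score(ys, preds), 4)
        history.extend(tokens + [f"<R2={reward}>"])
    return history


# Toy data generated from x^2 + x, and two candidates in search order.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [x * x + x for x in xs]
candidates = [
    ["mul", "x", "x"],              # x * x
    ["add", "mul", "x", "x", "x"],  # x * x + x
]
history = build_history(candidates, xs, ys)
print(history)
```

A model trained autoregressively on many such sequences, conditioned on an encoding of the data points, would see expressions whose rewards tend to improve along the sequence. That is the intuition behind generating a "reinforcement learning process" in context at test time.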

Paper Structure (34 sections, 5 figures, 10 tables, 1 algorithm)

Figures (5)

  • Figure 1: (a) The training process of FormulaGPT; (b) the inference process of FormulaGPT.
  • Figure 2: (a) Noise robustness of the five methods. Although the noise robustness of FormulaGPT is not as good as that of DSO, it is better than that of SNIP and NeSymReS. (b) $R^2$-time scatter plot. The inference time of FormulaGPT is much lower than that of DSO and SPL, and slightly higher than that of SNIP and NeSymReS. The closer the center point of an algorithm is to the bottom right corner, the better its overall performance; FormulaGPT is one of the algorithms closest to the bottom right.
  • Figure 3: Change of $R^2$ over time as FormulaGPT searches for several expressions. Although $R^2$ oscillates, it shows an overall upward trend.
  • Figure 4: Versatility test of the five algorithms. The trend of $R^2$ shows that although FormulaGPT is not as general as DSO and SPL, it is far better than the two pre-training methods, SNIP and NeSymReS. This is the result we expected.
  • Figure 5: Performance does not improve much when the data size increases from 10k to 50k. From 50k to 500k, however, performance improves greatly, because the data size is already considerable. As the data size increases further, performance continues to rise steadily, but more slowly.
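
The captions above report fitting quality as $R^2$. Assuming this is the standard coefficient of determination (the usual convention for SR benchmarks, not confirmed in this excerpt), then for targets $y_i$, predictions $\hat{y}_i$, and target mean $\bar{y}$ over $n$ points:

$$
R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}
$$

Under this definition, $R^2 = 1$ indicates a perfect fit, and a value at or below $0$ means the expression does no better than predicting the mean $\bar{y}$.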