Univariate Skeleton Prediction in Multivariate Systems Using Transformers

Giorgio Morales, John W. Sheppard

TL;DR

An explainable neural symbolic regression (SR) method is proposed that generates univariate symbolic skeletons, which aim to explain how each variable influences the system's response. The method outperforms two genetic programming (GP)-based and two neural SR methods.

Abstract

Symbolic regression (SR) methods attempt to learn mathematical expressions that approximate the behavior of an observed system. However, when dealing with multivariate systems, they often fail to identify the functional form that explains the relationship between each variable and the system's response. To begin to address this, we propose an explainable neural SR method that generates univariate symbolic skeletons that aim to explain how each variable influences the system's response. By analyzing multiple sets of data generated artificially, where one input variable varies while others are fixed, relationships are modeled separately for each input variable. The response of such artificial data sets is estimated using a regression neural network (NN). Finally, the multiple sets of input-response pairs are processed by a pre-trained Multi-Set Transformer that solves a problem we termed Multi-Set Skeleton Prediction and outputs a univariate symbolic skeleton. Thus, such skeletons represent explanations of the function approximated by the regression NN. Experimental results demonstrate that this method learns skeleton expressions matching the underlying functions and outperforms two GP-based and two neural SR methods.
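The data-generation step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `surrogate` function is a hypothetical stand-in for the trained regression NN, and `build_sets`, its parameters, and the variable bounds are names and values assumed for illustration only. Each set fixes every input except one at randomly drawn values, samples the remaining variable, and records the estimated response.

```python
import numpy as np


def surrogate(X):
    """Hypothetical stand-in for the trained regression NN (assumption)."""
    return np.sin(X[:, 0]) * X[:, 1] + X[:, 1] ** 2


def build_sets(model, var_idx, bounds, n_sets=3, n_points=50, seed=0):
    """Build artificial data sets in which only `var_idx` varies.

    Returns a list of (x, y, fixed) tuples: the sampled values of the
    analyzed variable, the model's estimated response, and the full
    vector of fixed values drawn for that set.
    """
    rng = np.random.default_rng(seed)
    lows = np.array([b[0] for b in bounds], dtype=float)
    highs = np.array([b[1] for b in bounds], dtype=float)
    sets = []
    for _ in range(n_sets):
        # Draw one random value per variable; all but `var_idx` stay fixed.
        fixed = rng.uniform(lows, highs)
        X = np.tile(fixed, (n_points, 1))
        # Overwrite the analyzed column with freshly sampled values.
        X[:, var_idx] = rng.uniform(lows[var_idx], highs[var_idx], n_points)
        # The response is estimated by the model, not observed directly.
        y = model(X)
        sets.append((X[:, var_idx].copy(), y, fixed))
    return sets


# Analyze the first variable of a two-variable system (bounds are assumed).
sets = build_sets(surrogate, var_idx=0, bounds=[(-5, 5), (-5, 5)])
```

In the pipeline the abstract describes, the resulting collection of input-response sets for one variable would then be passed to the pre-trained Multi-Set Transformer, which is outside the scope of this sketch.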

Paper Structure

This paper contains 20 sections, 4 equations, 5 figures, 8 tables, 2 algorithms.

Figures (5)

  • Figure 1.3.1: $x_1$ vs. $y$ curves when $x_2=4.45$, $0.2$, and $1.13$.
  • Figure 1.4.1: MZs for Field A and corresponding N-response curves MZ.
  • Figure A.1: An example of an MSSP problem using the Multi-Set Transformer.
  • Figure B.1: Example of a randomly generated expression.
  • Figure B.2: Generation data for $f(x) = \frac{-3.12 x}{\sin(1.45 x)} - 2.2$. (Left) Generated data on the entire domain. (Right) Detailed view of how singularities are avoided.