Operator Feature Neural Network for Symbolic Regression

Yusong Deng, Min Wu, Lina Yu, Jingyi Liu, Shu Wei, Yanjie Li, Weijun Li

TL;DR

OF-Net introduces operator-aware representations for symbolic regression by encoding mathematical operator logic into feature spaces and substituting operator features for numeric loss. It represents expressions as directed graphs and uses a DeepONet-inspired architecture with forward fitting and backward inference to learn operator features, guided by a constrained search over an operator adjacency graph. Experimental results on public datasets show that OF-Net achieves high skeleton-recovery rates and competitive $R^2$ scores, with particularly strong performance on univariate expressions and notable difficulties in bivariate cases due to data and operator-overlap issues. The work suggests future improvements in dataset diversity, operator-set design, and robust constant optimization to enhance extrapolation and scalability.

Abstract

Symbolic regression is the task of identifying patterns in data and representing them through mathematical expressions, generally involving skeleton prediction and constant optimization. Many methods have achieved some success; however, they treat variables and symbols merely as characters of natural language without considering their mathematical essence. This paper introduces the operator feature neural network (OF-Net), which employs an operator representation for expressions and proposes an implicit feature encoding method for the intrinsic mathematical operational logic of operators. By substituting operator features for numeric loss, we can predict the combination of operators in target expressions. We evaluate the model on public datasets, and the results demonstrate that the model achieves superior recovery rates and high $R^2$ scores. Through discussion of the results, we analyze the merits and demerits of OF-Net and propose optimization schemes.

Paper Structure

This paper contains 13 sections, 1 equation, 4 figures, 5 tables.

Figures (4)

  • Figure 1: An example of tree and graph representations. The expression $y = \sin(x_1) + \cos(x_2) + x_1$ has multiple preorder encodings as a tree but a unique encoding as a graph.
  • Figure 2: The structure of OF-Net. Hollow arrows show the forward process of operator fitting, and line arrows represent the backward inference.
  • Figure 3: Statistical results of the experiments. A: performance on all data. B: performance on univariate data. C: performance on bivariate data. D: $R^2$ on different datasets. E: recovery rate on different datasets.
  • Figure 4: Performance on expressions of different lengths.
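
The point made in the Figure 1 caption can be illustrated with a small sketch. Because addition is commutative, reordering the operands of $y = \sin(x_1) + \cos(x_2) + x_1$ yields several different preorder token sequences for the same function, whereas an unordered edge set over operator and variable nodes stays the same. The edge-set encoding below (`graph_edges`) is a hypothetical stand-in chosen for illustration; it is an assumption and not necessarily the graph format used in the paper.

```python
from itertools import permutations

# Terms of y = sin(x_1) + cos(x_2) + x_1, each written in prefix (preorder) form.
TERMS = [("sin", "x1"), ("cos", "x2"), ("x1",)]


def preorder_encodings(terms):
    """Return the preorder strings of all left-nested sums of `terms`.

    Each operand ordering gives a different token sequence, even though
    every ordering denotes the same function.
    """
    encodings = set()
    for order in permutations(terms):
        # Left-nested binary tree ((t0 + t1) + t2) has preorder "+ + t0 t1 t2".
        tokens = ["+"] * (len(order) - 1)
        for term in order:
            tokens.extend(term)
        encodings.add(" ".join(tokens))
    return sorted(encodings)


def graph_edges(terms):
    """Return an order-independent set of (parent, child) edges.

    Hypothetical encoding for illustration only (an assumption, not the
    paper's format). Variables are shared nodes, so x1 appears once. Using a
    set collapses repeated edges, which is acceptable for this example.
    """
    edges = set()
    for term in terms:
        edges.add(("+", term[0]))  # the sum node points to each term's head
        for parent, child in zip(term, term[1:]):
            edges.add((parent, child))  # a unary operator points to its argument
    return frozenset(edges)


if __name__ == "__main__":
    for encoding in preorder_encodings(TERMS):
        print(encoding)
    distinct_graphs = {graph_edges(order) for order in permutations(TERMS)}
    print(len(distinct_graphs), "distinct edge set(s)")
```

Under these assumptions, the six operand orderings produce six distinct preorder strings but a single edge set, which is the ambiguity that a graph representation is meant to avoid.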