Multi-task GINN-LP for Multi-target Symbolic Regression

Hussein Rajabu, Lijun Qian, Xishuang Dong

TL;DR

This work addresses two limitations of symbolic regression, its focus on single-output problems and on well-understood scientific datasets, by enabling multi-target, interpretable regression through a neuro-symbolic framework named MTRGINN-LP. The method fuses a Laurent-polynomial–based backbone (GINN-LP) with multi-task learning, and introduces a symbolic loss that enforces consistency between neural predictions and explicit symbolic expressions. Experiments on Energy Efficiency and Sustainable Agriculture demonstrate competitive predictive performance while preserving interpretability, highlighting the approach's potential to bridge symbolic regression and real-world multi-output tasks. Overall, MTRGINN-LP advances explainable AI for complex domains by delivering transparent equations alongside robust multi-target modeling, with future work directed at expanding the supported function families and the Polynomial Approximation Block (PAB) architectures.

Abstract

In the area of explainable artificial intelligence, Symbolic Regression (SR) has emerged as a promising approach that discovers interpretable mathematical expressions that fit data. However, SR faces two main challenges. First, most methods are evaluated on scientific datasets with well-understood relationships, which limits generalization. Second, SR primarily targets single-output regression, whereas many real-world problems involve multi-target outputs with interdependent variables. To address these issues, we propose multi-task regression GINN-LP (MTRGINN-LP), an interpretable neural network for multi-target symbolic regression. By integrating GINN-LP with multi-task deep learning, the model combines a shared backbone of multiple power-term approximator blocks with task-specific output layers, capturing inter-target dependencies while preserving interpretability. We validate multi-task GINN-LP on practical multi-target applications, including energy efficiency prediction and sustainable agriculture. Experimental results demonstrate competitive predictive performance alongside high interpretability, effectively extending symbolic regression to broader real-world multi-output tasks.
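The shared-backbone design described in the abstract can be illustrated with a minimal forward-pass sketch. It assumes that each block computes a single power term $\prod_i x_i^{w_i}$ through a log, linear, exp sequence (the usual Laurent-polynomial construction) and that each target has its own linear head over the shared terms. The function names, the weights, and the restriction to strictly positive inputs are illustrative assumptions, not the authors' implementation.

```python
import math

def pab_term(x, exponents):
    """One power-term block: prod_i x_i ** w_i, computed as exp(sum_i w_i * log(x_i)).

    Assumption: every input is strictly positive so that log is defined.
    """
    return math.exp(sum(w * math.log(xi) for w, xi in zip(exponents, x)))

def forward(x, shared_exponents, head_coeffs, head_biases):
    """Evaluate the shared power terms once, then apply one linear head per target."""
    terms = [pab_term(x, w) for w in shared_exponents]
    return [
        b + sum(c * t for c, t in zip(coeffs, terms))
        for coeffs, b in zip(head_coeffs, head_biases)
    ]

# Hypothetical weights: two shared blocks, two targets.
shared_exponents = [[1.0, -1.0], [2.0, 0.0]]  # terms x1 / x2 and x1 ** 2
head_coeffs = [[3.0, 0.0], [1.0, 0.5]]        # one coefficient row per target
head_biases = [1.0, 0.0]

y_hat = forward([2.0, 4.0], shared_exponents, head_coeffs, head_biases)
print(y_hat)
```

Because both heads read the same list of terms, any exponent learned in a shared block appears in every target's expression; only the coefficients and biases are task-specific.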

Paper Structure

This paper contains 17 sections, 10 equations, 7 figures, 4 tables, 1 algorithm.

Figures (7)

  • Figure 1: Diagram of the multi-task regression GINN-LP (MTRGINN-LP). Multiple Polynomial Approximation Blocks (PABs) are initialized and progressively expanded during training. The model then produces multi-task predictions $\hat{y}$ by applying linear activation functions $f_{\text{act}}(\cdot)$ to independently weighted sums of polynomial terms from the PABs. Finally, based on the predictions $\hat{y}$ and ground truth $\mathbf{y}$, the model is optimized by minimizing the total loss $\mathcal{L}$.
  • Figure 2: Symbolic functions learned by the proposed model for the Energy Efficiency datasets. For each target, the function $f(\cdot)$ consists of eight terms ($T_1$ to $T_8$) and one bias term, matching the optimal number of PABs (8) for the proposed model, where $f_{Y_1}$ and $f_{Y_2}$ estimate the two targets, Heating Load and Cooling Load, respectively. Each term combines the eight input attributes ($x_1$ to $x_8$) in exponential form.
  • Figure 3: An example of correlation analysis between inputs and outputs in the symbolic functions. It is assumed that the input attributes are independent of one another, and the output includes only the target Heating Load denoted as $Y$.
  • Figure 4: Hyperparameter effects on performance for the Energy Efficiency datasets. The figure shows how different configurations of the initial and maximum numbers of PABs influence multi-task regression performance in terms of MAE, MAPE, and RMSE. The initial number of PABs is set to 1, 2, or 3, while the maximum number of PABs is set to 2, 4, 6, 8, or 10.
  • Figure 5: Symbolic functions learned by the proposed model for the Sustainable Agriculture datasets. For each target, the function $f(\cdot)$ consists of six terms ($T_1$ to $T_6$) and one bias term, matching the optimal number of PABs (6) for the proposed model, where $f_{Y_1}$ and $f_{Y_2}$ estimate the two targets, Sustainability Score and Consumer Trend Index, respectively. Each term combines the fifteen input attributes ($x_1$ to $x_{15}$) in exponential form.
  • ...and 2 more figures
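
The symbolic functions in the figure captions above have one term per PAB plus a bias for each target. As a rough sketch of how such an expression might be read off a trained model, the snippet below formats one target's function from a set of exponents and coefficients. It assumes each term is a coefficient times a product of attributes raised to exponents; the helper names and all numeric values are hypothetical and are not taken from the paper.

```python
def term_to_str(coeff, exponents, tol=1e-6):
    """Format one term as 'c * x1^w1 * x2^w2 ...', skipping near-zero exponents."""
    factors = [
        f"x{i}^{w:.2f}"
        for i, w in enumerate(exponents, start=1)
        if abs(w) > tol
    ]
    body = " * ".join(factors) if factors else "1"
    return f"{coeff:.3f} * {body}"

def head_to_str(coeffs, bias, shared_exponents):
    """Format one target's function: a sum of shared terms plus a bias."""
    terms = [term_to_str(c, w) for c, w in zip(coeffs, shared_exponents)]
    return " + ".join(terms + [f"{bias:.3f}"])

# Hypothetical values: two shared terms over two attributes, one target.
expr = head_to_str([3.0, 0.5], 1.0, [[1.0, -1.0], [2.0, 0.0]])
print(expr)
```

Dropping exponents below a small tolerance keeps the printed expression short; the tolerance here is an arbitrary choice for illustration.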