GINN-LP: A Growing Interpretable Neural Network for Discovering Multivariate Laurent Polynomial Equations

Nisal Ranasinghe, Damith Senanayake, Sachith Seneviratne, Malin Premaratne, Saman Halgamuge

TL;DR

This work tackles interpretable equation discovery by focusing on multivariate Laurent polynomials (LPs). It introduces GINN-LP, an end-to-end differentiable neural network built from power-term approximator (PTA) blocks that realize LP terms as $x_1^{w_1} x_2^{w_2} \dots x_k^{w_k}$; a growth strategy automatically determines the number of LP terms, aided by sparsity regularization. The method achieves state-of-the-art results on the SRBench Feynman LP datasets and can be extended via an ensemble with high-performing SR methods to handle non-LP equations, yielding improved solution rates and $R^2$ performance. A key limitation is the requirement for positive inputs due to the log-based PTA activations, with future work proposed to address negative values and to extend beyond LP through deeper PTA stacking. Overall, GINN-LP provides a scalable, interpretable path to exact equation discovery and reliable model selection in scientific data analysis.
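The log-then-exp construction behind a PTA block can be illustrated with a few lines of plain Python. This is only a minimal sketch of the identity $\exp\left(\sum_i w_i \ln x_i\right) = \prod_i x_i^{w_i}$, not the authors' implementation; the function name `pta_block` and the example weights are hypothetical.

```python
import math

def pta_block(x, w):
    """One power term: exp(sum_i w_i * ln(x_i)) == prod_i x_i ** w_i."""
    # The logarithmic activation is only defined for strictly positive inputs,
    # which is the limitation noted in the summary above.
    if any(xi <= 0 for xi in x):
        raise ValueError("log activation requires strictly positive inputs")
    # Linear neuron applied to the log-transformed inputs.
    z = sum(wi * math.log(xi) for xi, wi in zip(x, w))
    # Exponential activation turns the weighted sum back into a product of powers.
    return math.exp(z)

x = [2.0, 4.0, 0.5]
w = [3.0, -1.0, 2.0]  # hypothetical exponents; negative values give Laurent terms
direct = math.prod(xi ** wi for xi, wi in zip(x, w))  # 2^3 * 4^-1 * 0.5^2
block = pta_block(x, w)
```

Because the exponents appear as ordinary weights of a linear neuron, they can be learned by gradient descent and then read off directly as the powers of the discovered term.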

Abstract

Traditional machine learning is generally treated as a black-box optimization problem and does not typically produce interpretable functions that connect inputs and outputs. However, the ability to discover such interpretable functions is desirable. In this work, we propose GINN-LP, an interpretable neural network to discover the form and coefficients of the underlying equation of a dataset, when the equation is assumed to take the form of a multivariate Laurent polynomial. This is facilitated by a new type of interpretable neural network block, named the "power-term approximator block", consisting of logarithmic and exponential activation functions. GINN-LP is end-to-end differentiable, making it possible to use backpropagation for training. We propose a neural network growth strategy that finds a suitable number of terms in the Laurent polynomial that represents the data, along with sparsity regularization to promote the discovery of concise equations. To the best of our knowledge, this is the first model that can discover arbitrary multivariate Laurent polynomial terms without any prior information on the order. Our approach is evaluated on a subset of data used in SRBench, a benchmark for symbolic regression. We first show that GINN-LP outperforms the state-of-the-art symbolic regression methods on datasets generated using 48 real-world equations in the form of multivariate Laurent polynomials. Next, we propose an ensemble method that combines our method with a high-performing symbolic regression method, enabling us to discover non-Laurent polynomial equations. We achieve state-of-the-art results in equation discovery, showing an absolute improvement of 7.1% over the best contender, by applying this ensemble method to 113 datasets within SRBench with known ground-truth equations.
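To make the abstract's description concrete, the sketch below evaluates several power-term blocks in parallel and reads an equation back from the weights. It rests on assumptions not stated in the text: the blocks are combined by a linear output layer with coefficients `coeffs`, and the sparsity regularizer is shown as a plain L1 penalty. All function names are hypothetical, and the growth strategy itself is not implemented here.

```python
import math

def ginn_lp_forward(x, exponents, coeffs, bias=0.0):
    """Sum of power terms: bias + sum_j c_j * prod_i x_i ** w_ji (inputs must be > 0)."""
    terms = [
        math.exp(sum(w * math.log(xi) for xi, w in zip(x, ws)))
        for ws in exponents  # one row of exponents per parallel block
    ]
    # Assumed linear output layer combining the block outputs.
    return bias + sum(c * t for c, t in zip(coeffs, terms))

def l1_penalty(exponents, coeffs):
    """Assumed form of sparsity regularization: sum of absolute weights."""
    return sum(abs(w) for ws in exponents for w in ws) + sum(abs(c) for c in coeffs)

def read_equation(exponents, coeffs, names, tol=1e-6):
    """Turn weights into a readable expression, skipping near-zero entries."""
    parts = []
    for c, ws in zip(coeffs, exponents):
        if abs(c) < tol:
            continue  # a block with a negligible coefficient contributes no term
        factors = [f"{n}^{w:g}" for n, w in zip(names, ws) if abs(w) >= tol]
        parts.append("*".join([f"{c:g}"] + factors))
    return " + ".join(parts)

# Hypothetical weights encoding y = 3 * x1^2 / x2 + 0.5 * x2
exponents = [[2.0, -1.0], [0.0, 1.0]]
coeffs = [3.0, 0.5]
y = ginn_lp_forward([2.0, 4.0], exponents, coeffs)
equation = read_equation(exponents, coeffs, ["x1", "x2"])
penalty = l1_penalty(exponents, coeffs)
```

Under these assumptions, the number of rows in `exponents` is the number of candidate terms, which is the quantity a growth strategy would increase step by step, while the penalty pushes unused exponents and coefficients toward zero so that the read-out stays concise.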

Paper Structure

This paper contains 19 sections, 7 equations, 12 figures, 4 tables, 1 algorithm.

Figures (12)

  • Figure 1: The architecture of the proposed interpretable NN block, named the "PTA" block. This block can discover a single term in a multivariate LP. $x_1, x_2, \dots, x_n$ are the inputs to the block and $w_1, w_2, \dots, w_n$ are the weights of the linearly activated neuron.
  • Figure 2: The architecture of the proposed interpretable neural network, GINN-LP. This consists of multiple PTA blocks in parallel, each discovering a single term in the underlying multivariate LP.
  • Figure 3: The ensemble pipeline combines GINN-LP with another high-performing SR method, enabling it to discover both LP and non-LP equations.
  • Figure 4: Solution rate of the top five algorithms, for all datasets with LP ground-truths. The median solution rates are shown on the side of each plot.
  • Figure 5: (a) Comparison of SR methods in the presence of target noise. The mean solution rates are reported. (b) Comparison of SR methods with respect to the ground-truth equation complexity. The mean solution rate is reported.
  • ...and 7 more figures