MixFunn: A Neural Network for Differential Equations with Improved Generalization and Interpretability

Tiago de Souza Farias, Gubio Gomes de Lima, Jonas Maziero, Celso Jorge Villas-Boas

TL;DR

MixFunn introduces a physics-informed neural network architecture that combines mixed-function neurons and second-order interactions to solve differential equations with high accuracy, strong generalization, and interpretable analytical expressions. By using softmax-based function selection, dropout, and pruning, the method achieves drastic parameter reductions (up to four orders of magnitude vs. conventional PINNs) while maintaining or improving performance across classical, quantum, and fluid dynamics problems. Empirical results on the damped/forced harmonic oscillator, Burgers’ equation, and the quantum infinite well show improved extrapolation to unseen domains and the extraction of compact analytical forms, with hybrid second-order enhancements further boosting efficiency. The work suggests a compelling path toward efficient, interpretable, and generalizable solver architectures for complex physical systems.

Abstract

We introduce MixFunn, a novel neural network architecture designed to solve differential equations with enhanced precision, interpretability, and generalization capability. The architecture comprises two key components: the mixed-function neuron, which integrates multiple parameterized nonlinear functions to improve representational flexibility, and the second-order neuron, which combines a linear transformation of its inputs with a quadratic term to capture cross-combinations of input variables. These features significantly enhance the expressive power of the network, enabling it to achieve comparable or superior results with drastically fewer parameters, a reduction of up to four orders of magnitude compared to conventional approaches. We applied MixFunn in a physics-informed setting to solve differential equations in classical mechanics, quantum mechanics, and fluid dynamics, demonstrating higher accuracy and improved generalization to regions outside the training domain relative to standard machine learning models. Furthermore, the architecture facilitates the extraction of interpretable analytical expressions, offering valuable insights into the underlying solutions.
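
As a rough illustration of the second-order neuron described in the abstract, the sketch below builds the pairwise products of the inputs and applies a single linear transformation to the inputs together with those cross-terms. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the choice to keep only products with i <= j, and the optional bias are all hypothetical.

```python
from itertools import combinations_with_replacement


def second_order_features(x):
    """Return the inputs followed by all products x[i] * x[j] with i <= j.

    Assumption: each unordered pair is kept once; the paper may order or
    count the cross-terms differently.
    """
    pairs = combinations_with_replacement(range(len(x)), 2)
    cross_terms = [x[i] * x[j] for i, j in pairs]
    return list(x) + cross_terms


def second_order_neuron(x, weights, bias=0.0):
    """Linear transformation of the inputs and their second-order cross-terms.

    `weights` needs one entry per feature: len(x) linear terms plus
    len(x) * (len(x) + 1) // 2 quadratic terms. The bias is an assumption.
    """
    features = second_order_features(x)
    if len(weights) != len(features):
        raise ValueError(
            f"expected {len(features)} weights, got {len(weights)}"
        )
    return sum(w * f for w, f in zip(weights, features)) + bias


if __name__ == "__main__":
    x = [2.0, 3.0]
    # Features are [x0, x1, x0*x0, x0*x1, x1*x1].
    print(second_order_features(x))
    print(second_order_neuron(x, [1.0, -1.0, 0.5, 2.0, 0.0]))
```

With two inputs the neuron sees five features instead of two, so a single unit can represent a product such as x0*x1 directly rather than approximating it with several layers.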

Paper Structure

This paper contains 17 sections, 15 equations, 12 figures, 2 tables.

Figures (12)

  • Figure 1: Schematic representation of a second-order neuron in a neural network. The input elements are cross-correlated up to second order; the cross-terms and the inputs are then summed and acted on by a linear transformation.
  • Figure 2: Diagram of a mixed-function neuron in a neural network. This neuron utilizes a set of diverse nonlinear functions $\{f_i(s_i)\}_{i=1}^{Q}$, where $Q$ is the number of functions (three in this figure). The output is a linear combination of these activation functions, $a=\sum_{i=1}^{Q} w_i f_i(s_i)$, with $w_i$ the weight of function $f_i$ and $s_i$ its input, which can come from the data inputs, from first-order neurons (as shown in this figure), or from second-order neurons. The activation functions are selected for their differentiability, their finiteness within the domain of interest, and the finiteness of their derivatives up to the highest order present in the differential equation being solved.
  • Figure 3: Comparison of the approximated solutions for the damped harmonic oscillator using the standard (a) PINN, (b) MixFunn, and (c) Mix2Funn models. The standard PINN fails to capture the dynamics accurately within the training domain and struggles with generalization in the test domain. In contrast, the Mix2Funn models perform significantly better, successfully approximating the oscillatory behavior even for large values of $t$. The shaded regions around the predicted solutions represent the variance due to five different model initializations.
  • Figure 4: Relationship between generalization error and the maximum value of the training domain $T_{max}$. The generalization error decreases as $T_{max}$ increases, indicating improved generalization capability. The shaded region indicates the variation in the average error due to different parameter initializations of the network.
  • Figure 5: Residual error as a function of the pruning ratio for the Mix2Funn model. The figure demonstrates that increasing the pruning ratio reduces the residual error, with the lowest error achieved at the highest pruning ratio, where all but one parameter are pruned.
  • ...and 7 more figures
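
The mixed-function neuron in the Figure 2 caption, $a=\sum_i w_i f_i(s_i)$, can be sketched in a few lines. The version below is a hedged illustration rather than the paper's code: the candidate set (sin, cos, tanh), the function names, and the way the weights $w_i$ are obtained from a temperature-scaled softmax over trainable logits are assumptions, loosely based on the softmax-based function selection mentioned in the TL;DR.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax; a lower temperature sharpens the selection."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate set: smooth functions with bounded derivatives.
CANDIDATES = (math.sin, math.cos, math.tanh)


def mixed_function_neuron(s, logits, functions=CANDIDATES, temperature=1.0):
    """Compute a = sum_i w_i * f_i(s_i), with w = softmax(logits).

    `s` holds one pre-activation s_i per candidate function.
    """
    if not (len(s) == len(logits) == len(functions)):
        raise ValueError("s, logits and functions must have the same length")
    weights = softmax(logits, temperature)
    return sum(w * f(si) for w, f, si in zip(weights, functions, s))


if __name__ == "__main__":
    s = [0.3, 0.3, 0.3]
    # Equal logits: the output is the plain average of the three functions.
    print(mixed_function_neuron(s, [0.0, 0.0, 0.0]))
    # One dominant logit: the neuron behaves almost exactly like cos.
    print(mixed_function_neuron(s, [-50.0, 50.0, -50.0]), math.cos(0.3))
```

When one logit dominates, the neuron collapses to a single named function, which is one plausible route to the compact analytical expressions the summary describes.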