MixFunn: A Neural Network for Differential Equations with Improved Generalization and Interpretability
Tiago de Souza Farias, Gubio Gomes de Lima, Jonas Maziero, Celso Jorge Villas-Boas
TL;DR
MixFunn is a physics-informed neural network architecture that combines mixed-function neurons and second-order interactions to solve differential equations with high accuracy, strong generalization, and interpretable analytical expressions. Using softmax-based function selection, dropout, and pruning, the method reduces parameter counts by up to four orders of magnitude relative to conventional PINNs while maintaining or improving performance across classical, quantum, and fluid dynamics problems. Empirical results on the damped/forced harmonic oscillator, Burgers’ equation, and the quantum infinite well show improved extrapolation to unseen domains and the extraction of compact analytical forms, with hybrid second-order enhancements further boosting efficiency. The work suggests a path toward efficient, interpretable, and generalizable solver architectures for complex physical systems.
Abstract
We introduce MixFunn, a novel neural network architecture designed to solve differential equations with enhanced precision, interpretability, and generalization capability. The architecture comprises two key components: the mixed-function neuron, which integrates multiple parameterized nonlinear functions to improve representational flexibility, and the second-order neuron, which combines a linear transformation of its inputs with a quadratic term to capture cross-combinations of input variables. These features significantly enhance the expressive power of the network, enabling it to achieve comparable or superior results with drastically fewer parameters: a reduction of up to four orders of magnitude compared to conventional approaches. We applied MixFunn in a physics-informed setting to solve differential equations in classical mechanics, quantum mechanics, and fluid dynamics, demonstrating its effectiveness in achieving higher accuracy and improved generalization to regions outside the training domain relative to standard machine learning models. Furthermore, the architecture facilitates the extraction of interpretable analytical expressions, offering valuable insights into the underlying solutions.
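The two components named in the abstract can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the candidate function set, the per-function scale and shift parameters, the upper-triangular form of the quadratic term, and all function names are assumptions made only for illustration. The softmax weighting follows the "softmax-based function selection" mentioned in the TL;DR.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def second_order_neuron(x, w, q, b):
    """Linear term plus quadratic cross terms (assumed form):
    b + sum_i w[i]*x[i] + sum_{i<=j} q[i][j]*x[i]*x[j].
    Only the upper triangle of q is used."""
    n = len(x)
    linear = sum(w[i] * x[i] for i in range(n))
    quadratic = sum(q[i][j] * x[i] * x[j]
                    for i in range(n) for j in range(i, n))
    return b + linear + quadratic

# Hypothetical candidate set; the functions actually used by MixFunn
# are not listed in this section.
CANDIDATES = [math.sin, math.cos, math.tanh, lambda z: z]

def mixed_function_neuron(z, logits, scales, shifts):
    """Softmax-weighted mixture of parameterized functions (assumed form):
    sum_k softmax(logits)[k] * f_k(scales[k]*z + shifts[k])."""
    weights = softmax(logits)
    return sum(p * f(a * z + c)
               for p, f, a, c in zip(weights, CANDIDATES, scales, shifts))

if __name__ == "__main__":
    # Pre-activation from a second-order neuron, then a mixed-function output.
    z = second_order_neuron([1.0, 2.0], [0.5, -1.0],
                            [[1.0, 2.0], [0.0, 3.0]], 0.1)
    print(mixed_function_neuron(z, [0.0] * 4, [1.0] * 4, [0.0] * 4))
```

In this sketch, when one logit dominates, the mixture collapses to a single candidate function. That is one plausible way a trained neuron of this kind could be read off as a compact analytical expression.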
