Knowledge-aware equation discovery with automated background knowledge extraction

Elizaveta Ivanchik, Alexander Hvatov

TL;DR

This work tackles the challenge of discovering unknown differential equations by moving beyond fixed-structure coefficient recovery to a knowledge-guided search that can uncover new equation forms. It introduces a term-preference distribution derived from background knowledge, together with an autonomous extraction pipeline (via a SymNet-based initial guess), to bias the evolutionary operators without imposing hard constraints on the search space. The approach improves the stability and robustness of equation discovery compared to SINDy, especially for complex PDEs such as the inhomogeneous KdV equation, the Burgers equation with viscosity, and the wave equation, under substantial noise. The method trades some computational speed for greater flexibility and discovery power, offering a practical pathway to data-driven discovery of physical models in scenarios where prior knowledge is available but not fully constraining.

Abstract

In differential equation discovery algorithms, a priori expert knowledge is mainly used implicitly to constrain the form of the expected equation, making it impossible for the algorithm to truly discover equations. Instead, most differential equation discovery algorithms try to recover the coefficients for a known structure. In this paper, we describe an algorithm that allows the discovery of unknown equations using automatically or manually extracted background knowledge. Instead of imposing rigid constraints, we modify the structure space so that certain terms are more likely to appear within the crossover and mutation operators. In this way, we mimic expertly chosen terms while preserving the possibility of obtaining any equation form. The paper shows that the extraction and use of knowledge allows the algorithm to outperform the SINDy algorithm in terms of search stability and robustness. Synthetic examples are given for the Burgers, wave, and Korteweg-de Vries equations.
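To make the idea of a soft term preference concrete, here is a minimal sketch of one way such a distribution could be built. It is not the authors' extraction pipeline: the function `term_preferences`, the `uniform_weight` parameter, and the choice to count term occurrences across a handful of candidate equations are all assumptions made for illustration. The point it shows is that mixing the empirical frequencies with a uniform distribution keeps every term's probability positive, so no equation form is excluded.

```python
from collections import Counter


def term_preferences(candidate_equations, all_terms, uniform_weight=0.2):
    """Hypothetical helper: turn term counts into a probability distribution.

    candidate_equations: list of equations, each a list of term labels
        (e.g. initial guesses from some background-knowledge source).
    all_terms: every term the search is allowed to use.
    uniform_weight: share of probability mass spread evenly over all terms
        (an assumed smoothing scheme, not taken from the paper).
    """
    if not 0.0 < uniform_weight <= 1.0:
        raise ValueError("uniform_weight must be in (0, 1]")

    allowed = set(all_terms)
    n_terms = len(all_terms)

    # Count in how many candidate equations each allowed term appears.
    counts = Counter(
        term
        for equation in candidate_equations
        for term in set(equation)
        if term in allowed
    )
    total = sum(counts.values())

    # No usable knowledge: fall back to the uniform distribution.
    if total == 0:
        return {term: 1.0 / n_terms for term in all_terms}

    # Mix empirical frequencies with a uniform floor so nothing is forbidden.
    return {
        term: (1.0 - uniform_weight) * counts[term] / total
        + uniform_weight / n_terms
        for term in all_terms
    }
```

For example, with the terms `u`, `du/dx`, `d2u/dx2`, and `u*du/dx` and three guesses that all contain `u*du/dx`, that term receives the largest share, while `u`, which appears in no guess, still keeps a small nonzero probability.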

Paper Structure (40 sections, 23 equations, 20 figures, 15 tables, 1 algorithm)

Figures (20)

  • Figure 1: Model visualization: $T_i$ are the token products from Eq. \ref{eq:model} and $t_i$ are the tokens from Eq. \ref{eq:token}.
  • Figure 2: The classical algorithm cross-over. All terms have the same probability of participating in the cross-over.
  • Figure 3: The classical algorithm mutation. New tokens a) and new term b) are generated using a uniform distribution.
  • Figure 4: Modified cross-over. Terms have a different probability of participating in the cross-over; for illustration, the most probable terms win.
  • Figure 5: Modified mutation. New token a) is chosen using the importance distribution (for illustration, the most probable token is taken), and new term b) is generated using the importance distribution.
  • ...and 15 more figures
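
The contrast between the classical operators (Figures 2 and 3) and the modified ones (Figures 4 and 5) can be sketched as follows. This is an illustrative toy, not the paper's implementation: equations are represented as plain lists of term labels, and the names `mutate_term`, `crossover`, and the `weights` dictionary are hypothetical. Passing `weights=None` gives the uniform (classical) behaviour; passing a dictionary of positive importances biases the choice without forbidding any term.

```python
import random


def mutate_term(equation, term_pool, weights=None, rng=random):
    """Replace one term of `equation` with a term drawn from `term_pool`.

    weights=None draws the new term uniformly (classical mutation);
    otherwise `weights[term]` is an assumed positive importance value.
    """
    new_equation = list(equation)
    position = rng.randrange(len(new_equation))

    # Only offer terms that are not already present in the equation.
    candidates = [t for t in term_pool if t not in new_equation]
    if not candidates:
        return new_equation

    candidate_weights = None if weights is None else [weights[t] for t in candidates]
    new_equation[position] = rng.choices(candidates, weights=candidate_weights, k=1)[0]
    return new_equation


def crossover(parent_a, parent_b, weights=None, rng=random):
    """Swap one term between two parents.

    weights=None picks the swapped terms uniformly (classical crossover);
    otherwise more important terms are more likely to take part.
    Note: this toy version does not remove duplicate terms in a child.
    """
    child_a, child_b = list(parent_a), list(parent_b)
    weights_a = None if weights is None else [weights[t] for t in child_a]
    weights_b = None if weights is None else [weights[t] for t in child_b]

    i = rng.choices(range(len(child_a)), weights=weights_a, k=1)[0]
    j = rng.choices(range(len(child_b)), weights=weights_b, k=1)[0]
    child_a[i], child_b[j] = child_b[j], child_a[i]
    return child_a, child_b
```

Because the weights only reshape the sampling probabilities, a term with low importance can still be selected, which matches the stated goal of guiding the search while preserving the possibility of obtaining any equation form.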