
Equation-informed data-driven identification of flow budgets and dynamics

Nataliya Sevryugina, Serena Costanzo, Stephen de Bruyn Kops, Colm-cille Caulfield, Iraj Mortazavi, Taraneh Sayadi

TL;DR

The paper introduces the Budget Identification Algorithm (BIA), a physics-informed clustering framework that uses pointwise SINDy to extract equation-budget features and then applies Newman modularity clustering to identify dynamically distinct flow regions. It extends to a dynamic Lagrangian formulation (Dynamic-BIA) that tracks cluster evolution as flow structures move, demonstrated on flow around a cylinder and on turbulent stratified flows using TBV and TKE budgets. Results show interpretable clusters corresponding to physically meaningful regimes and indicate that TBV provides clearer separation for buoyancy-driven dynamics in stratified turbulence. The work enables interpretable, region-specific model selection and paves the way for hybrid CFD strategies that adapt fidelity by region and time, with potential for open-source tooling via PySINDy integration.

Abstract

Computational Fluid Dynamics (CFD) is an indispensable method of fluid modelling in engineering applications, reducing the need for physical prototypes and testing for tasks such as design optimisation and performance analysis. Depending on the complexity of the system under consideration, models ranging from low to high fidelity can be used for prediction, allowing significant speed-up. However, the choice of model requires information about the actual dynamics of the flow regime. Correctly identifying the regions/clusters of flow that share the same dynamics has been a challenging research topic to date. In this study, we propose a novel hybrid approach to flow clustering. It characterises each sample point of the system with equation-based features: budgets that represent the contribution of each term of the original governing equation to the local dynamics at that point. These features are obtained by applying the Sparse Identification of Nonlinear Dynamical systems (SINDy) method pointwise to time evolution data. The method then performs equation-based clustering using the Girvan-Newman algorithm, which detects communities that share the same physical dynamics. The algorithm is implemented in both Eulerian and Lagrangian frameworks. In the Lagrangian (dynamic) approach, the clustering is performed on the trajectory of each point, so that changes in the clusters can also be represented in time. The performance of the algorithm is first tested on a flow around a cylinder. The dynamic clusters constructed in this test case clearly show the evolution of the wake from the steady-state solution through the transient to the oscillatory solution. Dynamic clustering was then successfully tested on turbulent flow data: two distinct and well-defined clusters were identified and their temporal evolution was reconstructed.
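The two-stage idea in the abstract (pointwise sparse regression to obtain budget features, then community detection on those features) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the data are synthetic, the regression is a hand-written sequentially thresholded least squares rather than a PySINDy call, the graph is built from a cosine-similarity cutoff chosen purely for illustration, and the clustering step is a single Newman leading-eigenvector modularity bisection, swapped in for the Girvan-Newman procedure named in the abstract. All function names and parameters below are hypothetical.

```python
import numpy as np

def stlsq(theta, dudt, threshold=0.1, n_iter=10):
    """Sequentially thresholded least squares: the basic SINDy regression."""
    xi = np.linalg.lstsq(theta, dudt, rcond=None)[0]
    for _ in range(n_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0                      # prune weak terms
        big = ~small
        if not big.any():
            break
        # refit using only the terms that survived the threshold
        xi[big] = np.linalg.lstsq(theta[:, big], dudt, rcond=None)[0]
    return xi

def budget_features(thetas, dudts, threshold=0.1):
    """One sparse fit per sample point; |coefficients| normalised per point."""
    xi = np.array([stlsq(th, d, threshold) for th, d in zip(thetas, dudts)])
    mag = np.abs(xi)
    scale = mag.max(axis=1, keepdims=True)
    scale[scale == 0] = 1.0                  # avoid division by zero
    return mag / scale

def similarity_graph(features, cutoff=0.9):
    """Assumed graph construction: link points with cosine similarity > cutoff."""
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    norms[norms == 0] = 1.0
    unit = features / norms
    adj = (unit @ unit.T > cutoff).astype(float)
    np.fill_diagonal(adj, 0.0)               # no self-loops
    return adj

def newman_bisection(adj):
    """Split nodes by the sign of the leading eigenvector of the modularity matrix."""
    degree = adj.sum(axis=1)
    two_m = degree.sum()
    if two_m == 0:
        return np.zeros(len(adj), dtype=int)
    modularity_matrix = adj - np.outer(degree, degree) / two_m
    _, vecs = np.linalg.eigh(modularity_matrix)  # eigenvalues in ascending order
    return (vecs[:, -1] > 0).astype(int)

# Synthetic example: 20 points, 50 time samples, 3 candidate terms.
rng = np.random.default_rng(0)
n_points, n_times, n_terms = 20, 50, 3
thetas = rng.normal(size=(n_points, n_times, n_terms))
true_xi = np.zeros((n_points, n_terms))
true_xi[:10] = [-1.0, 0.5, 0.0]              # regime A: terms 0 and 1 active
true_xi[10:] = [0.0, 0.0, 0.8]               # regime B: only term 2 active
dudts = np.einsum("ptk,pk->pt", thetas, true_xi)
dudts += 1e-3 * rng.normal(size=dudts.shape)  # small measurement noise

features = budget_features(thetas, dudts)
labels = newman_bisection(similarity_graph(features))
print(labels)
```

In this toy setting the two regimes have disjoint active terms, so the similarity graph separates into two groups and one bisection is enough. Recovering more than two communities would require repeated subdivision or a different community-detection routine.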

Paper Structure

This paper contains 16 sections, 30 equations, 10 figures, 1 table, 2 algorithms.

Figures (10)

  • Figure 1: Vorticity field for a flow around a cylinder at $Re=200$, with three randomly selected points in the domain representing different flow regimes [costanzo2022].
  • Figure 2: Identified active terms of the library of candidate functions for the three random points [costanzo2022].
  • Figure 3: Dynamics identification on a two-dimensional flow around a cylinder at $Re=200$. (a) The selected points are divided into three main communities sharing the same dynamics; (b) rearranged adjacency matrix showing correlations between communities; (c) example of active coefficients identified for points belonging to different communities, with values normalized for each point [costanzo2022].
  • Figure 4: Application of the BIA algorithm to the NS equation for flow around the cylinder: Newman clustering of the coefficients of the NS equation (4 clusters, yellow/purple/green/blue) (left), representation of the $x$-coordinate velocity (center), evolution of the total kinetic energy of the flow over time, with a red dot marking the time instant for each row (right).
  • Figure 5: Application of the BIA algorithm to the NS equation: representation of active coefficients for the purple/blue/green/yellow clusters from Figure 4. Each cluster is assigned a colormap of the corresponding color representing the level of contribution of each term; darker tones correspond to maximum ($max=1$) and lighter tones to minimum ($min=0$) contributions.
  • ...and 5 more figures