Data-driven functional state estimation of complex networks

Yuan Zhang, Ziyuan Luo, Wenxuan Xu, Jiayu Wu, Wenqi Cao, Ranbo Cheng, Tingting Qin, Yuanqing Xia, Mohamed Darouach, Aming Li, Tyrone Fernando

TL;DR

This work introduces a data-driven framework for designing functional observers, which estimate a targeted set of state variables without identifying the model parameters. It also establishes a functional observability criterion, based on historical trajectories, that guarantees the existence of such observers.

Abstract

The internal state of a dynamical system, a set of variables that defines its evolving configuration, is often hidden and cannot be fully measured, posing a central challenge for real-time monitoring and control. While observers are designed to estimate these latent states from sensor outputs, their classical designs rely on precise system models, which are often unattainable for complex network systems. Here, we introduce a data-driven framework for designing functional observers, which estimate a targeted set of state variables, without identifying the model parameters. We establish a fundamental functional observability criterion based on historical trajectories that guarantees the existence of such observers. We then develop methods to construct observers using either input-output data or partial state data. These observers match or exceed the performance of model-based counterparts while remaining applicable even to unobservable systems. The framework incorporates noise mitigation and can be easily extended to nonlinear networks via Koopman embeddings. We demonstrate its broad utility through applications including sensor fault detection in water networks, load-frequency control in power grids, and target estimation in nonlinear neuronal systems. Our work provides a practical route for real-time target state inference in complex systems where models are unavailable.
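
For a system whose model is known, the gap between full and functional observability that the abstract relies on can be checked with the rank condition quoted in the Figure 1 caption, ${\rm rank}\,\mathbf O(\mathbf A,\mathbf C)={\rm rank}\,[\mathbf O(\mathbf A,\mathbf C)^\intercal,\mathbf O(\mathbf A,\mathbf F)^\intercal]^\intercal$. The following is a minimal sketch of that model-based test only, not the paper's data-driven criterion. The function names and the 3-state toy matrices are illustrative assumptions, and NumPy is assumed to be available.

```python
import numpy as np


def obsv(A, C):
    """Stack C, CA, ..., CA^(n-1) into the observability matrix O(A, C)."""
    n = A.shape[0]
    blocks = [C]
    for _ in range(n - 1):
        blocks.append(blocks[-1] @ A)
    return np.vstack(blocks)


def is_functionally_observable(A, C, F, tol=1e-9):
    """Model-based rank test: rank O(A,C) == rank [O(A,C); O(A,F)]."""
    O_c = obsv(A, C)
    O_cf = np.vstack([O_c, obsv(A, F)])
    rank_c = np.linalg.matrix_rank(O_c, tol=tol)
    rank_cf = np.linalg.matrix_rank(O_cf, tol=tol)
    return bool(rank_c == rank_cf)


# Hypothetical toy system: state 3 is decoupled from the sensor on state 1,
# so the pair (A, C) is not fully observable.
A = np.array([[0.5, 1.0, 0.0],
              [0.0, 0.3, 0.0],
              [0.0, 0.0, 0.8]])
C = np.array([[1.0, 0.0, 0.0]])         # sensor on state 1
F_target = np.array([[0.0, 1.0, 0.0]])  # target: state 2, which drives state 1
F_hidden = np.array([[0.0, 0.0, 1.0]])  # target: state 3, invisible to the sensor

full_rank = np.linalg.matrix_rank(obsv(A, C))
print("rank O(A, C):", full_rank, "of n =", A.shape[0])
print("state 2 functionally observable:", is_functionally_observable(A, C, F_target))
print("state 3 functionally observable:", is_functionally_observable(A, C, F_hidden))
```

In this toy case the observability matrix is rank-deficient, yet state 2 still passes the test because its observability rows add nothing outside the row space of $\mathbf O(\mathbf A,\mathbf C)$, while state 3 fails it. This mirrors the situation described for the Figure 1 network, where the system is functionally observable but not fully state observable.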

Paper Structure

This paper contains 4 sections, 34 equations, 6 figures, 1 algorithm.

Figures (6)

  • Figure 1: Two-step designed observers versus the proposed direct data-driven functional observer for not fully observable systems. a The general structure of an observer, which uses real-time input and output data to estimate the system's internal state. The illustrative network has $n=10$ nodes with edge weights randomly drawn from a uniform distribution on $(0,1)$. The system has $m=3$ inputs (blue nodes), $p=2$ outputs (red nodes), and a single target functional state $r=1$ (yellow node). This system is controllable and functionally observable, but not fully state observable (${\rm rank}(\mathbf{O(A,C)})={\rm rank}(\left[\mathbf{O(A,C)^\intercal},\mathbf{O(A,F)^\intercal}\right]^\intercal)= 8 < n = 10$). b Network topologies identified from a single noisy input-output trajectory (the output is contaminated by independent identically distributed (i.i.d.) white noise with variance $0.01$) using a subspace-based method ("Methods"), assuming different prior state dimensions (9 and 10), where $({\hat{\mathbf A}}, \hat{\mathbf B}, \hat{\mathbf C}, \hat{\mathbf F})$ are the identified parameters ($(\hat{\mathbf C}, \hat{\mathbf F})$ is collectively identified by regarding $[\mathbf y^\intercal(t),\mathbf z^\intercal(t)]^\intercal$ as the output). Corresponding Luenberger observers were designed for full-state estimation (see Fig. S1 for trajectories of the observers). c The proposed data-driven framework for functional observer design. The observer parameters are learned directly from data and the dynamic equation iteration enables real-time estimation of the target state without intermediate full-state reconstruction. d Performance comparison. Top: Orders of the designed observers. The proposed functional observer is of lower order. Bottom: RRMSE of the functional state estimation over 100 independent experiments. Here, the functional observer is constructed from the noisy historical data using the IO-data based design detailed subsequently. The RRMSE was computed over the time period $[50,100]$ to exclude transient effects. For each experiment, the initial state of the original system was randomly initialized, while all observers started from zero.
  • Figure 2: Performance comparison between data-driven and identification-and-model-based functional observer designs. a,b RRMSE distribution for data-driven (purple) and ID+model-based (blue) observers in Erdős--Rényi (a) and Barabási--Albert (b) networks. Each data point represents an average over 100 independent network realizations. c Computational time required for observer design in ER (left) and BA (right) networks across varying system dimensions $2n$. d Average observer order and standard deviation for both design methods. Benchmarks were performed on randomly generated ER and BA networks with $2n \in [100,200]$ states (each node is a 2-dimensional subsystem; see "Methods"). Systems have $m = \lfloor 2n/5 \rfloor$ inputs, $p = \lfloor n/2 \rfloor$ outputs, and $r = \lfloor n/10 \rfloor$ functional states, with dedicated node assignments, where $\lfloor a \rfloor$ takes the floor of $a$. All networks have average degree $\langle k\rangle=10$ and BA networks have the initial number of nodes $m_0=20$. Systems are stabilized by normalizing the state matrix by $1.01\rho(\mathbf{A})$, where $\rho(\mathbf A)$ denotes the spectral radius of $\mathbf A$. The data-driven functional observer was trained directly on historical input-output data of length $T = T_{\rm IO}^* + 100$ (sufficient for system identification), while the model-based approach first identified system matrices using the MOESP algorithm (see "Methods") with true state dimensions given, followed by conventional functional observer design [fernando2010functional]. To test the observers' performance, the initial states of the original systems are sampled from the uniform distribution on $(-1,1)$ while all observers start from zero. RRMSE is computed over the time period $[T,2T]$ with step input $u_i(t)=1, \forall t>0,\ i=1,2,\ldots,m$.
  • Figure 3: Observer performance with partial state information. a RRMSE as a function of the percentage of partial information. b Average observer orders across different information levels. c Percentage of convergent realizations as a function of the percentage of partial information. Lines correspond to the extended-state based design with data length $T=800$, while dots correspond to the IO-data based design with a much larger data length $T=3000$. We generate ER and BA random networks using the same settings as Fig. 2 with fixed parameters $2n=500$, $m=80$, $p=100$, $r=10$, and varying parameters $p_{\text{edge}}$ for ER networks and average degrees $\langle k \rangle$ for BA networks, but only select the observable network realizations to ensure that all partial state information ${\mathbf P}\mathbf X$ corresponds to the observable component. The rows of $\mathbf Z$ and $\mathbf W$ are uniformly sampled from rows of $\mathbf X$. The RRMSE is taken over the period $\left[T,2T\right]$ with step input $u_i(t)=1, \forall t>0,\ i=1,2,\ldots,m$. All results are averaged over 100 independent realizations.
  • Figure 4: Sensor fault detection and recovery in a water network. a The structure of EPANET's network 3. Each edge is bidirectional (except the edges from input nodes to state nodes) and each state node has a self-loop which is omitted. The reservoirs whose water heads can be arbitrarily set by the controller serve as the system inputs (blue nodes), while the remaining 95 nodes evolve according to the network dynamics. b The trajectory of the target node's height (true functional state) and its estimations from four independent functional observers (FO), each using measurements of one normal sensor. c The trajectories of the aforementioned four functional observers, where sensor 1 is under attack. Red areas indicate the time intervals of sensor attack. d Estimations of the value of sensor 1 from four independent functional observers, where each uses measurement of one normal sensor to estimate the value of sensor 1. In our numerical experiments, we simulate the transport of engine oil at $10^\circ$C, with $\rho = 885\,\mathrm{kg/m^3}$, and $\mu = 582.95\,\mathrm{mPa\cdot s}$. Pipe diameters $D_{ij}$ are sampled uniformly from 10 to 20 m and lengths $L_{ij}$ from 3000 to 4000 m. Continuous-time dynamics are discretized using a forward-Euler scheme with a time step of $\Delta t = 0.01$ s. The resulting system matrix is normalized by its spectral radius to ensure numerical stability. The historical data are generated with Gaussian random inputs.
  • Figure 5: a Partition of the IEEE 39-bus system, with Area 1 in the bottom-left, Area 2 in the bottom-right, and Area 3 in the top. b The open-loop and closed-loop frequency deviations (green region, after $t=150\,\mathrm{s}$) of Area 1 when a persistent $0.1$ step load disturbance is imposed on Area 1 at time $t=100\,\mathrm{s}$ (red region). c The open-loop and closed-loop tie-line deviations of Area 1. d Control input and the error of learning a control law using functional observers after $t=150\,\mathrm{s}$.
  • ...and 1 more figure
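
Two preprocessing steps named in the captions above can be sketched in a few lines: scaling the state matrix by a multiple of its spectral radius (Figures 2 and 4) and forward-Euler discretization of continuous-time dynamics (Figure 4). This is a minimal sketch under stated assumptions, not the authors' code. The helper names, the random 10-node weight matrix, and the 2-state continuous-time matrix are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def spectral_radius(A):
    """Largest eigenvalue magnitude, rho(A)."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))


def stabilize(A, margin=1.01):
    """Divide A by margin * rho(A), so the result has spectral radius 1/margin."""
    rho = spectral_radius(A)
    if rho == 0.0:
        raise ValueError("spectral radius is zero; nothing to normalize")
    return A / (margin * rho)


def forward_euler(A_cont, dt):
    """Discretize dx/dt = A_cont x as x(t + dt) ~ (I + dt * A_cont) x(t)."""
    return np.eye(A_cont.shape[0]) + dt * A_cont


# Hypothetical 10-node network with weights drawn uniformly from (0, 1).
rng = np.random.default_rng(0)
A_raw = rng.uniform(0.0, 1.0, size=(10, 10))
A_stable = stabilize(A_raw)  # Figure 2 convention: divide by 1.01 * rho(A)
print("rho before:", spectral_radius(A_raw))
print("rho after: ", spectral_radius(A_stable))

# Hypothetical 2-state continuous-time matrix, using the step dt = 0.01
# quoted in the Figure 4 caption.
A_cont = np.array([[-1.0, 0.5],
                   [0.5, -1.0]])
A_disc = forward_euler(A_cont, dt=0.01)
print(A_disc)
```

With `margin=1.01` the scaled matrix has spectral radius $1/1.01 < 1$, so the discrete-time system is asymptotically stable. Calling `stabilize(A, margin=1.0)` corresponds to the plain spectral-radius normalization described for Figure 4, which places the spectral radius exactly at 1.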