Information Theoretically Optimal Sample Complexity of Learning Dynamical Directed Acyclic Graphs

Mishfad Shaikh Veedu; Deepjyoti Deka; Murti V. Salapaka

TL;DR

This work addresses the problem of identifying the directed structure of a dynamical DAG (DDAG) from time-series data generated by a linear dynamical system whose exogenous noise sources are wide-sense stationary and share the same power spectral density (PSD). It introduces a reconstruction algorithm based on the PSD matrix (PSDM) that uses the conditional PSD deficit metric $f(i,C,\omega)$ to derive a topological ordering and recover each node's parents, under two sampling schemes. The authors establish non-asymptotic concentration bounds for the PSDM estimate, show that the required number of samples scales as $n=\Theta\big( M^6 q \log(p/q) \big)$ (for fixed reconstruction tolerance), and prove a matching information-theoretic lower bound, which establishes order-optimality. These results quantify the fundamental sample complexity of exact DDAG learning and offer guidance for structure learning in dynamical networks under realistic sampling constraints.

Abstract

In this article, the optimal sample complexity of learning the underlying interactions or dependencies of a Linear Dynamical System (LDS) over a Directed Acyclic Graph (DAG) is studied. We call such a DAG underlying an LDS a dynamical DAG (DDAG). In particular, we consider a DDAG where the nodal dynamics are driven by unobserved exogenous noise sources that are wide-sense stationary (WSS) in time, mutually uncorrelated, and have the same power spectral density (PSD). Inspired by the static DAG setting, a metric and an algorithm based on the PSD matrix of the observed time series are proposed to reconstruct the DDAG. It is shown that the optimal sample complexity (or length of state trajectory) needed to learn the DDAG is $n=\Theta(q\log(p/q))$, where $p$ is the number of nodes and $q$ is the maximum number of parents per node. To prove the sample complexity upper bound, a concentration bound for the PSD estimation is derived under two different sampling strategies. A matching min-max lower bound using the generalized Fano's inequality is also provided, thus showing the order optimality of the proposed algorithm.
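To get a feel for the $q\log(p/q)$ rate stated above, the following minimal sketch evaluates only the scaling term. The function name `scaling_term` is hypothetical, and the constants hidden by the $\Theta(\cdot)$ notation (including any dependence on system parameters) are deliberately omitted, so the values are relative growth rates, not sample counts.

```python
import math


def scaling_term(p: int, q: int) -> float:
    """Return q * log(p / q), the growth term in n = Theta(q log(p/q)).

    p: number of nodes; q: maximum number of parents per node.
    Constants hidden by Theta are NOT included (assumption: we only
    compare how the requirement grows, not its absolute size).
    """
    if not (1 <= q < p):
        raise ValueError("expected 1 <= q < p")
    return q * math.log(p / q)


if __name__ == "__main__":
    # Growth in p is logarithmic for fixed q; growth in q is roughly linear.
    for p, q in [(20, 2), (200, 2), (2000, 2), (200, 4), (200, 8)]:
        print(f"p={p:5d}  q={q}  q*log(p/q)={scaling_term(p, q):.2f}")
```

Under this rate, doubling the number of nodes at fixed $q$ adds only $q\log 2$ to the term, whereas increasing the in-degree $q$ raises it almost proportionally.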

Paper Structure (20 sections, 16 theorems, 67 equations, 3 figures, 1 algorithm)

Introduction
Related Work
System model of DDAG
Reconstructing DDAGs from PSDM
CPSD and Topological Ordering
Conditional PSD deficit
Finite Sample Analysis of Reconstructing DDAGs
Non-Asymptotic Estimation Error in Spectrogram Method
Sample Complexity Bounds: Upper Bound
Sample Complexity Bounds: Lower Bound
Simulation Experiments
Proof of Lemma 3.3
Proof of Lemma 3.8
Proof of Theorem 4.2
Restart and Record Sampling
...and 5 more sections

Key Result

Lemma 3.1

Consider the LDS described by the system model equation (eq:LDS). For any $\omega \in \Omega$, let $\alpha^*:=\min\limits_{k\in V}\Phi_{kk}(\omega)$. Then $\Phi_{ii}(\omega)=\alpha^*$ if and only if $i$ is a source node.
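The lemma can be checked numerically on a toy system. The sketch below is an illustration under stated assumptions, not the paper's algorithm: the 3-node graph, the filter coefficients, the trajectory count, and the helper names `simulate_toy_ddag` and `psd_diagonal` are all hypothetical. The noise is i.i.d. unit-variance Gaussian, so every node is driven by the same (flat) PSD, and the diagonal of the PSD matrix is estimated by averaging periodograms over independent trajectories, in the spirit of restart and record sampling.

```python
import numpy as np


def simulate_toy_ddag(n_traj, length, rng):
    """Simulate a hypothetical 3-node DDAG with edges 0->1, 0->2, 1->2.

    All nodes are driven by i.i.d. N(0, 1) noise (equal PSD).
    Returns an array of shape (n_traj, 3, length).
    """
    e = rng.standard_normal((n_traj, 3, length))
    x = np.zeros_like(e)
    x[:, 0, :] = e[:, 0, :]                       # node 0 has no parents
    x[:, 1, 0] = e[:, 1, 0]
    x[:, 1, 1:] = 0.8 * x[:, 0, :-1] + e[:, 1, 1:]
    x[:, 2, 0] = 0.6 * x[:, 0, 0] + e[:, 2, 0]
    x[:, 2, 1:] = (0.5 * x[:, 1, :-1] + 0.6 * x[:, 0, 1:]
                   + e[:, 2, 1:])
    return x


def psd_diagonal(x, k):
    """Averaged-periodogram estimate of Phi_ii at DFT bin k, for each node i."""
    length = x.shape[-1]
    coeff = np.fft.fft(x, axis=-1)[..., k]        # shape (n_traj, 3)
    return np.mean(np.abs(coeff) ** 2, axis=0) / length


rng = np.random.default_rng(0)
x = simulate_toy_ddag(n_traj=2000, length=64, rng=rng)
phi_diag = psd_diagonal(x, k=17)                  # one frequency bin
source = int(np.argmin(phi_diag))                 # candidate source node
print("estimated Phi_ii:", np.round(phi_diag, 3))
print("node with minimal PSD:", source)
```

In this toy system the non-source nodes accumulate the noise of their ancestors, so their PSD exceeds the common noise level at every frequency, and the minimizer of the estimated diagonal is the source node once enough trajectories are averaged.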

Figures (3)

Figure 1: An example DDAG. Node 1 is an ancestor and node 7 is a descendant of every node in the graph. The set $\{1,2,5\}$ is an ancestral set but $\{2,5\}$ is not. $an(3)=\{1,2\}$, $desc(3)=\{4,7\}$, $nd(3)=\{1,2,5,6\}$.
Figure 2: (a) shows restart and record sampling. (b) shows continuous sampling.
Figure 3: Probability of error versus the number of trajectories, for different networks under restart and record (RR) and continuous (conti) sampling strategies, when the system is excited by either i.i.d. or WSS noise. $p$ and $q$ refer to the number of nodes and the degree of each node in the network, respectively. The trajectory length $N=64$ is fixed for each trajectory, and $\omega=\frac{17}{64}$.
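The graph terminology in the Figure 1 caption (ancestors, descendants, non-descendants, ancestral sets) can be made concrete with a short sketch. The figure's edge set is not reproduced here, so the edge list below is a hypothetical 7-node DAG chosen only to be consistent with the statements in the caption; it should not be read as the actual graph in the paper.

```python
# Hypothetical edge list (parent, child), consistent with the caption
# of Figure 1 but NOT taken from the figure itself.
EDGES = [(1, 2), (2, 3), (3, 4), (4, 7), (1, 5), (5, 6), (6, 7)]
NODES = {u for edge in EDGES for u in edge}


def _reach(adjacency, node):
    """Return all nodes reachable from `node` via `adjacency` (excluding it)."""
    seen, stack = set(), [node]
    while stack:
        for nxt in adjacency.get(stack.pop(), ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def ancestors(node, edges=EDGES):
    parents = {}
    for u, v in edges:
        parents.setdefault(v, set()).add(u)
    return _reach(parents, node)


def descendants(node, edges=EDGES):
    children = {}
    for u, v in edges:
        children.setdefault(u, set()).add(v)
    return _reach(children, node)


def non_descendants(node, edges=EDGES, nodes=NODES):
    return nodes - descendants(node, edges) - {node}


def is_ancestral(subset, edges=EDGES):
    """A set is ancestral if it contains the ancestors of each of its members."""
    return all(ancestors(v, edges) <= set(subset) for v in subset)


if __name__ == "__main__":
    print("an(3)   =", sorted(ancestors(3)))
    print("desc(3) =", sorted(descendants(3)))
    print("nd(3)   =", sorted(non_descendants(3)))
    print("{1,2,5} ancestral:", is_ancestral({1, 2, 5}))
    print("{2,5} ancestral:  ", is_ancestral({2, 5}))
```

On this assumed graph the helpers reproduce the sets quoted in the caption, which is a quick way to sanity-check an implementation of the definitions before using them in a topological-ordering routine.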

Theorems & Definitions (24)

Remark 2.1
Definition 2.4
Lemma 3.1
Definition 3.2: The CPSD deficit
Lemma 3.3
Corollary 3.4
Lemma 3.5
proof
Lemma 3.6
proof
...and 14 more
