Learning Sparse High-Dimensional Matrix-Valued Graphical Models From Dependent Data

Jitendra K Tugnait

TL;DR

This work addresses inference of the conditional independence graph (CIG) of a sparse, high-dimensional, matrix-valued Gaussian time series from dependent observations. It introduces a frequency-domain penalized likelihood with a Kronecker-decomposable power spectral density (PSD) and solves the resulting bi-convex program with a flip-flop algorithm built on two ADMM subroutines; graph edges are read from the nonzero entries of the inverse PSD components ${\bm \Omega}$ and $\{ {\bm \Phi}_k \}$. Theoretical results give sufficient conditions for local convergence in the Frobenius norm, with a rate, of the inverse PSD estimators in the high-dimensional regime. Empirical results on synthetic data and the Beijing air-quality dataset show improved performance over i.i.d.-based methods and illustrate the interpretability of the learned Kronecker-structured CIGs.

Abstract

We consider the problem of inferring the conditional independence graph (CIG) of a sparse, high-dimensional, stationary matrix-variate Gaussian time series. All past work on high-dimensional matrix graphical models assumes that independent and identically distributed (i.i.d.) observations of the matrix-variate are available. Here we allow dependent observations. We consider a sparse-group lasso-based frequency-domain formulation of the problem with a Kronecker-decomposable power spectral density (PSD), and solve it via an alternating direction method of multipliers (ADMM) approach. The problem is bi-convex and is solved via flip-flop optimization. We provide sufficient conditions for local convergence in the Frobenius norm of the inverse PSD estimators to the true value. This result also yields a rate of convergence. We illustrate our approach using numerical examples utilizing both synthetic and real data.
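
As a rough illustration of how the two graphs could be read off once the inverse PSD components have been estimated, the sketch below declares an edge wherever an off-diagonal entry is nonzero. It is not the paper's code: the function name `edges_from_inverse_psd`, the tolerance `tol` used in place of an exact zero test, and the rule that an entry of $\{ {\bm \Phi}_k \}$ counts as nonzero if it is nonzero at any frequency $k$ are all assumptions made for this example.

```python
import numpy as np

def edges_from_inverse_psd(omega, phis, tol=1e-8):
    """Hypothetical helper: read two edge sets from estimated inverse PSD components.

    omega : (p, p) array, estimate of Omega.
    phis  : (K, q, q) array, estimates of Phi_k at K frequencies (may be complex).
    tol   : assumed threshold standing in for an exact zero test.
    Returns two sets of index pairs (i, j) with i < j.
    """
    omega = np.asarray(omega)
    phis = np.asarray(phis)
    p, q = omega.shape[0], phis.shape[1]

    # Graph from Omega: edge {i, j} when the off-diagonal entry is nonzero.
    omega_edges = {(i, j) for i in range(p) for j in range(i + 1, p)
                   if abs(omega[i, j]) > tol}

    # Graph from {Phi_k}: pool each (i, j) entry over all frequencies
    # (an l2 norm across k), so the edge is absent only if it is zero at every k.
    pooled = np.sqrt((np.abs(phis) ** 2).sum(axis=0))
    phi_edges = {(i, j) for i in range(q) for j in range(i + 1, q)
                 if pooled[i, j] > tol}
    return omega_edges, phi_edges

# Toy inputs (made up for illustration, not results from the paper).
omega = np.array([[2.0, -0.5, 0.0],
                  [-0.5, 2.0, 0.0],
                  [0.0, 0.0, 1.0]])
phis = np.zeros((2, 3, 3), dtype=complex)
phis[:] = np.eye(3)
phis[1, 0, 2] = 0.3j    # nonzero at the second frequency only
phis[1, 2, 0] = -0.3j   # Hermitian counterpart
omega_edges, phi_edges = edges_from_inverse_psd(omega, phis)
```

Pooling across frequencies mirrors the group structure of a sparse-group lasso penalty, where an edge is switched off only when its whole group of entries is zero.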

Paper Structure (19 sections, 104 equations, 2 figures, 1 table)

Figures (2)

  • Figure 1: ROC curves: plots labeled "IID" are from the approach of Leng2012, Tsiligkaridis2013, Yin2012, and the plots labeled "dep." are from our proposed approach. TPR = true positive rate, TNR = true negative rate.
  • Figure 2: Pollution and site graphs for the Beijing air-quality dataset Zhang2017 for the year 2013-14: 8 monitoring sites and 11 features ($p=8$, $q=11$, $n=364$). Number of distinct edges $=18, \, 28, \, 20, \, 6, \, 30, \, 28$ in graphs (a), (b), (c), (d), (e) and (f), respectively. Monitoring sites labeled Stn. 1-4 are the rural/suburban sites and those labeled Stn. 5-8 are the urban sites (see the text). For the pollution graph, estimated $\hat{\bm \Phi}^{(ij)}$ is the edge weight (normalized to have $\max_{i \ne j}\hat{\bm \Phi}^{(ij)}=1$) and for the site graph, estimated $|\hat{\Omega}_{ij}|$ is the edge weight (normalized to have $\max_{i \ne j}|\hat{\Omega}_{ij}|=1$). The edge weights are color coded (all pollution graphs share the same color legend, and similarly for the site graphs), and edges with higher weights are drawn thicker.