On Conditional Independence Graph Learning From Multi-Attribute Gaussian Dependent Time Series

Jitendra K. Tugnait

TL;DR

This work develops a unified framework for learning the conditional independence graph (CIG) of high-dimensional, multi-attribute Gaussian time series by formulating a penalized log-likelihood in the frequency domain. It analyzes both convex (sparse-group lasso) and non-convex (log-sum, SCAD) penalties and establishes theoretical guarantees of consistency and graph recovery under high-dimensional scaling without requiring incoherence conditions. The optimization uses an ADMM-based approach, with local linear approximation to handle the non-convex penalties, and tuning parameters are selected via the Bayesian information criterion. Empirical results on synthetic data and a Beijing air-quality dataset indicate that non-convex penalties, particularly log-sum, can yield sparser, more accurate CIGs and favorable trade-offs between recovery performance and computation.

Abstract

Estimation of the conditional independence graph (CIG) of high-dimensional multivariate Gaussian time series from multi-attribute data is considered. Existing methods for graph estimation for such data are based on single-attribute models where one associates a scalar time series with each node. In multi-attribute graphical models, each node represents a random vector or vector time series. In this paper we provide a unified theoretical analysis of multi-attribute graph learning for dependent time series using a penalized log-likelihood objective function formulated in the frequency domain using the discrete Fourier transform of the time-domain data. We consider both convex (sparse-group lasso) and non-convex (log-sum and SCAD group penalties) penalty/regularization functions. We establish sufficient conditions in a high-dimensional setting for consistency (convergence of the inverse power spectral density to true value in the Frobenius norm), local convexity when using non-convex penalties, and graph recovery. We do not impose any incoherence or irrepresentability condition for our convergence results. We also empirically investigate selection of the tuning parameters based on the Bayesian information criterion, and illustrate our approach using numerical examples utilizing both synthetic and real data.
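
To make the penalty terminology concrete, the following is a minimal sketch of the three scalar penalty shapes named above (lasso, log-sum, SCAD), together with the derivative weights that a local linear approximation step would plug into a reweighted lasso problem. The formulas are commonly used parametrizations and are an assumption here, since this page does not state them; the function names, the parameters `lam`, `eps`, and `a`, and their default values are likewise illustrative. In a group setting, the argument `t` would be the norm of a group of coefficients rather than a single coefficient.

```python
import math


def lasso_penalty(t, lam):
    """Convex penalty: lam * |t|."""
    return lam * abs(t)


def log_sum_penalty(t, lam, eps=1e-3):
    """Non-convex log-sum penalty (assumed form): lam * eps * ln(1 + |t| / eps)."""
    return lam * eps * math.log(1.0 + abs(t) / eps)


def scad_penalty(t, lam, a=3.7):
    """Non-convex SCAD penalty (assumed standard form) with shape parameter a > 2."""
    t = abs(t)
    if t <= lam:
        return lam * t
    if t <= a * lam:
        return (2.0 * a * lam * t - t * t - lam * lam) / (2.0 * (a - 1.0))
    return lam * lam * (a + 1.0) / 2.0


def lla_weight(t, lam, kind, eps=1e-3, a=3.7):
    """Derivative of the chosen penalty at |t|; used as the per-term lasso weight."""
    t = abs(t)
    if kind == "lasso":
        return lam
    if kind == "log_sum":
        return lam * eps / (eps + t)
    if kind == "scad":
        if t <= lam:
            return lam
        if t <= a * lam:
            return (a * lam - t) / (a - 1.0)
        return 0.0
    raise ValueError("unknown penalty kind: " + kind)


# Large coefficients are penalized less by the non-convex choices than by lasso.
for kind in ("lasso", "log_sum", "scad"):
    print(kind, lla_weight(0.0, 1.0, kind), lla_weight(10.0, 1.0, kind))
```

The weights illustrate why non-convex penalties can reduce the bias of lasso-type estimates: the lasso weight stays at `lam` regardless of magnitude, whereas the log-sum and SCAD weights shrink toward zero as the current estimate grows.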

Paper Structure

This paper contains 16 sections, 89 equations, 3 figures, 4 tables, 1 algorithm.

Figures (3)

  • Figure 1: True $\log_{10} (\sum_{f=0:0.01:5} | [S^{-1}(f)]_{ij} | )$, $i,j \in [256]$, for extended graphs for a single Monte Carlo run: $mp=4 \times 64 = 256$ nodes.
  • Figure 2: Pollution graphs for the Beijing air-quality dataset [Zhang2017] for the year 2013-14: 8 monitoring sites and 11 features ($m=8$, $p=11$, $M=4$, $n=364$). The number of distinct edges is $29$ and $7$ in graphs (a) and (b), respectively. The estimated $\|\hat{\bm \Omega}^{(ijM)}\|_F$ is the edge weight (normalized to have $\max_{i \ne j}\|\hat{\bm \Omega}^{(ijM)}\|_F=1$); see (\ref{eqth2_20c}). The edge weights are color-coded, and edges with higher weights are drawn thicker.
  • Figure 3: Estimated $\log_{10} ( \sqrt{\sum_{k=1}^M | [\hat{\bm \Phi}_k]_{ij} |^2 } )$, $i,j \in [88]$, for the Beijing air-quality dataset ($m=8$, $p=11$, $M=4$, $n=364$). There are $p=11$ nodes (PM$_{2.5}$ labeled as node 1, PM$_{10}$ as node 2, and so on, moving counter-clockwise in Fig. \ref{figreal}), with each variable measured at $m=8$ stations.
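
The normalized edge weights described in the Figure 2 and Figure 3 captions can be sketched as follows. This is an illustration under stated assumptions rather than the paper's code: it assumes the $m$ attributes of node $i$ occupy contiguous rows and columns $im, \dots, (i+1)m-1$ (zero-based) of each $mp \times mp$ matrix $\hat{\bm \Phi}_k$, and that the edge weight is the Frobenius norm of the $(i,j)$ block accumulated over the $M$ matrices. The function name `edge_weights` and the toy matrix are hypothetical.

```python
import math


def edge_weights(phis, p, m):
    """Return a p x p list of normalized edge weights.

    phis: list of M square (p*m x p*m) matrices, given as nested lists of
          real or complex numbers (assumed contiguous node-block layout).
    """
    w = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(p):
            if i == j:
                continue  # self-loops are not edges
            total = 0.0
            for phi in phis:
                for r in range(i * m, (i + 1) * m):
                    for c in range(j * m, (j + 1) * m):
                        total += abs(phi[r][c]) ** 2
            w[i][j] = math.sqrt(total)
    # Normalize so that the largest off-diagonal weight equals 1.
    top = max(max(row) for row in w)
    if top > 0.0:
        w = [[v / top for v in row] for row in w]
    return w


# Toy example: p = 3 nodes, m = 2 attributes per node, M = 1 matrix.
n = 6
phi = [[0j] * n for _ in range(n)]
for d in range(n):
    phi[d][d] = 1.0
phi[0][2], phi[2][0] = 3.0, 3.0    # block (0, 1)
phi[1][3], phi[3][1] = 4j, -4j     # block (0, 1), complex entry
phi[2][4], phi[4][2] = 1.0, 1.0    # block (1, 2)

weights = edge_weights([phi], p=3, m=2)
print(weights[0][1], weights[1][2], weights[0][2])
```

In this toy case the (0, 1) block has the largest norm and so receives weight 1, the (1, 2) block receives a smaller weight, and the (0, 2) block is all zeros, which corresponds to no edge.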