DNNLasso: Scalable Graph Learning for Matrix-Variate Data

Meixia Lin; Yangjing Zhang

TL;DR

This work addresses scalable learning of row-wise and column-wise dependencies in matrix-variate observations by modeling the precision matrix as a Kronecker sum (KS), $\Sigma^{-1}=\Omega\oplus\Gamma$, so that only the two smaller factor matrices need to be estimated rather than the full precision matrix. The authors propose DNNLasso, a diagonally non-negative graphical lasso solved with an ADMM framework that uses explicit proximal operators for the negative log-determinant KS function. Key contributions include the diagonal non-negativity constraint, which resolves identifiability and ensures bounded solutions; a provably convergent ADMM algorithm; and closed-form proximal updates that enable scalable learning on large graphs. Empirical results on synthetic data and COIL100 video data show that DNNLasso outperforms state-of-the-art KS estimators in both accuracy and computational time, making it a practical tool for structure learning in high-dimensional matrix-variate settings.
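To make the Kronecker-sum structure concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the standard definition $\Omega\oplus\Gamma=\Omega\otimes I_t+I_s\otimes\Gamma$ and uses the standard fact that the eigenvalues of a Kronecker sum are all pairwise sums of the factors' eigenvalues, so the log-determinant can be computed from the two small matrices alone. The helper names `kronecker_sum`, `ks_logdet`, and `random_spd` are hypothetical.

```python
import numpy as np

def kronecker_sum(omega, gamma):
    """Return omega (+) gamma = kron(omega, I_t) + kron(I_s, gamma)."""
    s, t = omega.shape[0], gamma.shape[0]
    return np.kron(omega, np.eye(t)) + np.kron(np.eye(s), gamma)

def ks_logdet(omega, gamma):
    """log det(omega (+) gamma) using only the eigenvalues of the two factors."""
    lam = np.linalg.eigvalsh(omega)          # s eigenvalues of omega
    mu = np.linalg.eigvalsh(gamma)           # t eigenvalues of gamma
    pair_sums = lam[:, None] + mu[None, :]   # the s*t eigenvalues of the Kronecker sum
    return float(np.sum(np.log(pair_sums)))

rng = np.random.default_rng(0)

def random_spd(n):
    """Illustrative symmetric positive definite matrix (eigenvalues >= 1)."""
    a = rng.standard_normal((n, n))
    return a @ a.T / n + np.eye(n)

omega, gamma = random_spd(4), random_spd(3)
full = kronecker_sum(omega, gamma)            # dense 12 x 12 precision matrix
sign, dense_logdet = np.linalg.slogdet(full)  # reference value from the dense matrix
factored_logdet = ks_logdet(omega, gamma)     # same value from the small factors
```

The dense route works on an $st\times st$ matrix, while the factored route only needs eigendecompositions of an $s\times s$ and a $t\times t$ matrix, which is one reason the KS structure is attractive at scale.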

Abstract

We consider the problem of jointly learning row-wise and column-wise dependencies of matrix-variate observations, which are modelled separately by two precision matrices. Due to the complicated structure of Kronecker-product precision matrices in the commonly used matrix-variate Gaussian graphical models, a sparser Kronecker-sum structure was proposed recently based on the Cartesian product of graphs. However, existing methods for estimating Kronecker-sum structured precision matrices do not scale well to large-scale datasets. In this paper, we introduce DNNLasso, a diagonally non-negative graphical lasso model for estimating the Kronecker-sum structured precision matrix, which outperforms the state-of-the-art methods by a large margin in both accuracy and computational time. Our code is available at https://github.com/YangjingZhang/DNNLasso.
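The claim that the Kronecker-sum structure is sparser than the Kronecker-product structure can be illustrated on a toy example. The sketch below is an illustration under assumptions, not taken from the paper: it uses two hypothetical chain-graph precision matrices (`path_graph_precision` is a made-up helper) and counts the edges implied by the off-diagonal non-zeros of each combined matrix.

```python
import numpy as np

def path_graph_precision(n):
    """Tridiagonal matrix whose sparsity pattern is a chain graph on n nodes."""
    return 2.0 * np.eye(n) - 0.5 * (np.eye(n, k=1) + np.eye(n, k=-1))

def edge_count(m):
    """Number of undirected edges = off-diagonal non-zeros divided by two."""
    offdiag_nnz = np.count_nonzero(m) - np.count_nonzero(np.diag(m))
    return int(offdiag_nnz) // 2

s, t = 5, 4
omega, gamma = path_graph_precision(s), path_graph_precision(t)

kron_product = np.kron(omega, gamma)
kron_sum = np.kron(omega, np.eye(t)) + np.kron(np.eye(s), gamma)

kp_edges = edge_count(kron_product)  # pattern also links nodes that differ in both coordinates
ks_edges = edge_count(kron_sum)      # Cartesian product: links differ in one coordinate only
```

For these two chains the Kronecker sum yields 31 edges against 55 for the Kronecker product, matching the Cartesian-product count $s\,e_t + t\,e_s = 5\cdot 3 + 4\cdot 4$, where $e_s$ and $e_t$ are the edge counts of the two chains.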

Paper Structure (22 sections, 5 theorems, 31 equations, 10 figures, 2 tables, 2 algorithms)

INTRODUCTION
Related Works
Contributions
Notation
ESTIMATION OF A KRONECKER-SUM PRECISION MATRIX
Kronecker-Sum Structured Precision Matrix
Equivalent Formulation of Problem eq:KS
DNNLASSO
Alternating Direction Method of Multipliers
Proximal Operators Associated with the Negative Log-determinant KS Function
The Full Algorithm
NUMERICAL EXPERIMENTS
Synthetic Data
COIL100 Video Data
CONCLUSION
...and 7 more sections

Key Result

Proposition 2

Problems eq:KS and eq:KS_NN are equivalent in the following sense: (a) they share the same optimal objective function value; (b) any optimal solution to eq:KS is optimal to eq:KS_NN; (c) if $(\Gamma^*,\Omega^*)$ is an optimal solution to eq:KS_NN, then with $c= (\lambda_{\rm min}(\Gamma^*)-\lambda_{\rm min}(\Omega^*))/2$, the shifted pair $(\Gamma^*-cI,\ \Omega^*+cI)$ is an optimal solution to eq:KS.
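The role of the constant $c$ in part (c) can be checked numerically. The sketch below assumes the shift is applied as $(\Gamma-cI,\ \Omega+cI)$; this form is an assumption of the example, chosen because it leaves the Kronecker sum unchanged. The matrices are hypothetical, picked so that one factor is indefinite with a non-negative diagonal while the Kronecker sum is still positive definite. It does not reproduce the constraints of eq:KS or eq:KS_NN, which are not stated in this summary.

```python
import numpy as np

def kronecker_sum(a, b):
    """a (+) b = kron(a, I) + kron(I, b)."""
    return np.kron(a, np.eye(b.shape[0])) + np.kron(np.eye(a.shape[0]), b)

# Hypothetical factors: gamma is indefinite (eigenvalues -0.5 and 1.5) but has
# a non-negative diagonal; omega is positive definite.
gamma = np.array([[0.5, 1.0],
                  [1.0, 0.5]])
omega = np.diag([3.0, 4.0])

lam_min_gamma = np.linalg.eigvalsh(gamma)[0]
lam_min_omega = np.linalg.eigvalsh(omega)[0]
c = (lam_min_gamma - lam_min_omega) / 2.0  # the constant from part (c)

# Assumed form of the shift: move c*I from one factor to the other.
gamma_shift = gamma - c * np.eye(gamma.shape[0])
omega_shift = omega + c * np.eye(omega.shape[0])

before = kronecker_sum(omega, gamma)
after = kronecker_sum(omega_shift, gamma_shift)

min_eig_gamma_shift = np.linalg.eigvalsh(gamma_shift)[0]
min_eig_omega_shift = np.linalg.eigvalsh(omega_shift)[0]
```

After the shift both factors have the same smallest eigenvalue, $(\lambda_{\rm min}(\Gamma)+\lambda_{\rm min}(\Omega))/2$, so each is positive definite whenever the Kronecker sum is, while the Kronecker sum itself is unchanged. This is the non-identifiability along the diagonal that the proposition resolves.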

Figures (10)

Figure 1: Relative error (a,c) / Fscore (b,d) against $\lambda_0$ for synthetic graphs of Type 2 with dimension $s=t=500$ (a,b) / $s=100,t=500$ (c,d), and sample size $n = 1$, $st/10000$ or $st/100$.
Figure 2: Relative objective function value $(f^k - f^*)/f^*$ against time on synthetic graphs of Type 2 with dimension $s=t=500$ (left column) / $s=100,t=500$ (right column), and sample size $n = 1$, $st/10000$ or $st/100$ (rows from upper to lower).
Figure 3: Relative objective function values against time on graphs of Type 1 with sample size $n=1$ and dimensions $s=t=1000$ (upper row) / $s=1000,t=400$ (bottom row), $\lambda_0=10^{-2}$ (left column) / $\lambda_0=10^{-1.6}$ (right column).
Figure 4: A rotating box of cold medicine in COIL100 video data. First row: original resolution of $128\times 128$ pixels. Second (resp. third) row: reduced resolution of $32\times 32$ (resp. $8\times 8$) pixels.
Figure 5: On $s=72$ frames with $t=32\times 32$ pixels. (a) The BIC and sparsity level against $\lambda_0$. (b) The relative objective function value against computational time. (c) Sparsity pattern of the matrix $\widetilde{\Omega}\in\mathbb{S}^s$ estimated by DNNLasso (i.e., the correlation pattern among frames from different angles). (d) Relationship graph of frames from different angles.
...and 5 more figures

Theorems & Definitions (8)

Proposition 2
Theorem 3
Theorem 4
Proposition 5
proof
proof
proof
Proposition 6
