Learning the hub graphical Lasso model with the structured sparsity via an efficient algorithm
Chengjing Wang, Peipei Tang, Wenling He, Meixia Lin
TL;DR
This work tackles learning discriminated hub graphical Lasso (DHGL) models in high dimensions, where conventional pADMM methods struggle with accuracy and speed. It introduces a two-phase optimization framework: Phase I uses a dual ADMM (dADMM) to generate a good initial point, and Phase II warm-starts an augmented Lagrangian method (ALM) with a semismooth Newton (SSN) inner solver to obtain highly accurate solutions, exploiting the hub-induced sparsity of a surrogate generalized Jacobian. The authors provide convergence analyses for the dADMM, ALM, and SSN procedures and validate the approach through extensive experiments on synthetic and real data, reporting runtime reductions of more than 70% on some high-dimensional tasks while maintaining high-quality estimates. The results demonstrate scalable, accurate inference for hub-structured Gaussian graphical models, with potential applications to biological networks, social graphs, and financial portfolios.
Abstract
Graphical models have performed well in numerous tasks ranging from biological analysis to recommender systems. However, graphical models with hub nodes are computationally difficult to fit, particularly when the dimension of the data is large. To estimate hub graphical models efficiently, we introduce a two-phase algorithm. The proposed algorithm first generates a good initial point via a dual alternating direction method of multipliers (ADMM), and then warm-starts a semismooth Newton (SSN) based augmented Lagrangian method (ALM) to compute a solution that is accurate enough for practical tasks. We fully exploit the sparsity structure of the generalized Jacobian that arises from the hubs in the graphical model, which allows the algorithm to obtain an accurate solution very efficiently. Comprehensive experiments on both synthetic and real data show that the algorithm clearly outperforms existing state-of-the-art algorithms. In particular, on some high-dimensional tasks it saves more than 70% of the execution time while still achieving a high-quality estimate.
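To make the structured-sparsity penalty concrete, the following minimal sketch evaluates the standard hub graphical Lasso objective, in which the precision matrix is split as Theta = Z + V + V^T, with Z carrying the sparse non-hub edges and the columns of V carrying the hubs. This is an illustration under assumptions: the function names `offdiag` and `hgl_objective` and the parameters `lam1`, `lam2`, `lam3` are hypothetical, the formulation is the commonly used hub graphical Lasso form rather than necessarily the exact discriminated variant studied here, and the code only evaluates the objective; it does not implement the dual ADMM or SSN-based ALM solvers.

```python
import numpy as np

def offdiag(M):
    """Return a copy of M with its diagonal set to zero."""
    return M - np.diag(np.diag(M))

def hgl_objective(S, Z, V, lam1, lam2, lam3):
    """Evaluate a hub graphical Lasso objective at Theta = Z + V + V^T.

    S is the sample covariance matrix and Z is assumed symmetric.
    Returns +inf when Theta is not positive definite.
    """
    Theta = Z + V + V.T
    eigvals = np.linalg.eigvalsh(Theta)  # Theta is symmetric by construction
    if eigvals.min() <= 0:
        return np.inf
    # Gaussian negative log-likelihood (up to constants): -log det(Theta) + <S, Theta>
    loss = -np.sum(np.log(eigvals)) + np.trace(S @ Theta)
    V0 = offdiag(V)
    penalty = (
        lam1 * np.abs(offdiag(Z)).sum()            # elementwise sparsity of non-hub edges
        + lam2 * np.abs(V0).sum()                  # elementwise sparsity inside hub columns
        + lam3 * np.linalg.norm(V0, axis=0).sum()  # column-wise group penalty selecting hubs
    )
    return loss + penalty
```

The column-wise group norm is what encourages entire columns of V to be either zero or dense, so that a small number of nodes emerge as hubs; this column structure is the kind of sparsity that a second-order solver can exploit.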
