Learning Graph Laplacian with MCP

Yangjing Zhang, Kim-Chuan Toh, Defeng Sun

TL;DR

This work addresses learning graph Laplacians under Laplacian constraints using a non-convex MCP penalty to promote sparsity in edge weights. It derives an inexact proximal DCA framework that reformulates the problem in edge-weight space and leverages a DC decomposition of MCP, with subproblems solved efficiently via a semismooth Newton method on the dual. The authors establish convergence to critical points and demonstrate, through extensive synthetic and real-data experiments, that MCP yields sparser, less biased graphs and better edge-recovery than competitive convex penalties, while achieving superior computational efficiency compared to existing non-convex solvers. The approach is generalizable to other non-convex penalties (e.g., SCAD) and has practical impact for scalable, interpretable graph learning in high dimensions.
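To make the "edge-weight space" reformulation concrete, the following is a minimal sketch of how a vector of nonnegative edge weights determines a graph Laplacian via $L = D - W$, where $W$ is the symmetric weight matrix and $D$ the diagonal degree matrix. The function name `laplacian_from_weights` and the ordering of edges (upper triangle, row by row) are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def laplacian_from_weights(w, n):
    """Build an n x n graph Laplacian from n*(n-1)/2 edge weights.

    Assumed (hypothetical) convention: w lists the weights of edges
    (i, j) with i < j in row-major order of the upper triangle.
    """
    w = np.asarray(w, dtype=float)
    if w.shape != (n * (n - 1) // 2,):
        raise ValueError("w must have length n*(n-1)/2")
    W = np.zeros((n, n))
    W[np.triu_indices(n, k=1)] = w   # fill the strict upper triangle
    W = W + W.T                      # symmetrize: undirected graph
    D = np.diag(W.sum(axis=1))       # weighted degrees on the diagonal
    return D - W

# Three nodes, edges (0,1)=1, (0,2)=2, (1,2)=0 (no edge).
L = laplacian_from_weights([1.0, 2.0, 0.0], n=3)
```

Under this convention, a zero entry in `w` means the corresponding edge is absent, so sparsity of the weight vector is the same as sparsity of the graph. Every row of the resulting matrix sums to zero and the off-diagonal entries are nonpositive, which is what the Laplacian constraint requires.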

Abstract

We consider the problem of learning a graph under the Laplacian constraint with a non-convex penalty: minimax concave penalty (MCP). For solving the MCP penalized graphical model, we design an inexact proximal difference-of-convex algorithm (DCA) and prove its convergence to critical points. We note that each subproblem of the proximal DCA enjoys the nice property that the objective function in its dual problem is continuously differentiable with a semismooth gradient. Therefore, we apply an efficient semismooth Newton method to subproblems of the proximal DCA. Numerical experiments on various synthetic and real data sets demonstrate the effectiveness of the non-convex penalty MCP in promoting sparsity. Compared with the existing state-of-the-art method, our method is demonstrated to be more efficient and reliable for learning graph Laplacian with MCP.
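As an illustration of why a difference-of-convex algorithm applies, the sketch below evaluates MCP in its standard scalar form and checks numerically that it splits as $\lambda|x| - h(x)$ with $h$ convex and continuously differentiable. The parametrization with $\lambda > 0$ and $\gamma > 1$, and the function names `mcp`, `mcp_smooth_part`, and `mcp_smooth_grad`, are assumptions for illustration; the paper's own notation and scaling may differ.

```python
import numpy as np

def mcp(x, lam, gamma):
    """Standard scalar MCP, applied elementwise (assumed parametrization)."""
    a = np.abs(np.asarray(x, dtype=float))
    inner = lam * a - a**2 / (2.0 * gamma)   # region |x| <= gamma*lam
    outer = 0.5 * gamma * lam**2             # constant beyond gamma*lam
    return np.where(a <= gamma * lam, inner, outer)

def mcp_smooth_part(x, lam, gamma):
    """Convex, differentiable h with mcp(x) = lam*|x| - h(x)."""
    a = np.abs(np.asarray(x, dtype=float))
    inner = a**2 / (2.0 * gamma)
    outer = lam * a - 0.5 * gamma * lam**2
    return np.where(a <= gamma * lam, inner, outer)

def mcp_smooth_grad(x, lam, gamma):
    """Gradient of h, the quantity a DCA step linearizes around."""
    x = np.asarray(x, dtype=float)
    return np.where(np.abs(x) <= gamma * lam, x / gamma, lam * np.sign(x))

# Numerical check of the DC decomposition on a grid.
xs = np.linspace(-5.0, 5.0, 101)
lam, gamma = 1.0, 2.0
gap = np.max(np.abs(mcp(xs, lam, gamma)
                    - (lam * np.abs(xs) - mcp_smooth_part(xs, lam, gamma))))
```

In a DCA iteration the smooth part $h$ is replaced by its linearization at the current iterate, which leaves a convex $\ell_1$-type subproblem. Because MCP is constant for $|x| > \gamma\lambda$, large weights are not shrunk further, which is the usual explanation for its lower bias compared with the $\ell_1$ penalty.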

Paper Structure

This paper contains 20 sections, 7 theorems, 58 equations, 15 figures, 3 tables, 2 algorithms.

Key Result

Lemma 2.1

Let $w^{k+1}$ be an approximate solution to the subproblem (subprob) such that the stopping condition (stop-cond) holds. Then, if the sequence $\{f(w^k)\}$ is bounded below, it converges to a finite number.

Figures (15)

  • Figure 1: On Erdős-Rényi graph, $\mathcal{G}^{(100,0.1)}_{\rm ER}$. The true connectivity matrix $A = A_{\rm true}$ is used.
  • Figure 2: On Erdős-Rényi graph, $\mathcal{G}^{(100,0.1)}_{\rm ER}$. A coarse estimate of the true sparsity pattern, $A = A_{\rm coarse}$, is used.
  • Figure 3: On Erdős-Rényi graph, $\mathcal{G}^{(100,0.1)}_{\rm ER}$. The full connectivity matrix $A = A_{\rm full}$ is used.
  • Figure 4: On Erdős-Rényi graph, $\mathcal{G}^{(100,0.1)}_{\rm ER}$. A rough, not exactly accurate estimate of the true sparsity pattern, $A = A_{\rm 10\%}$, is used.
  • Figure 5: On animals data set. Left: the number of edges against the penalty parameter $\lambda$. Middle: dependency graph of the MCP solution with $\lambda=0$. Right: dependency graph of the MCP solution with $\lambda=10^{-0.25}$.
  • ...and 10 more figures

Theorems & Definitions (10)

  • Lemma 2.1
  • Theorem 2.2
  • Theorem 2.3
  • Definition 2.4: Subdifferentials
  • Definition 2.5: KL property
  • Theorem 2.6
  • Proposition 3.1
  • Remark 1
  • Lemma A.1
  • Proposition C.1