Smoothing Gradient Tracking for Decentralized Optimization over the Stiefel Manifold with Non-smooth Regularizers

Lei Wang, Xin Liu

TL;DR

This work tackles decentralized optimization over the Stiefel manifold with non-smooth regularizers by smoothing the non-smooth terms through the Moreau envelope, enabling the use of smooth-optimization techniques in a distributed setting. The authors propose THANOS, a decentralized smoothing gradient-tracking algorithm that operates with a fixed smoothing parameter and achieves an $\mathcal{O}(\epsilon^{-4})$ iteration complexity to reach an $\epsilon$-stationary point of the original problem. They establish rigorous convergence guarantees and demonstrate empirical effectiveness on sparse PCA, highlighting the method's potential for large-scale networked problems with orthogonality constraints. The approach offers a practical path for non-smooth, non-convex optimization on manifolds in distributed systems, with implications for signal processing and machine learning tasks requiring orthogonal representations.

Abstract

Recently, decentralized optimization over the Stiefel manifold has attracted tremendous attention due to its wide range of applications in various fields. Existing methods rely on gradients to update variables and are therefore not applicable to objective functions with non-smooth regularizers, such as sparse PCA. In this paper, we propose what is, to the best of our knowledge, the first decentralized algorithm for non-smooth optimization over Stiefel manifolds. Our algorithm approximates the non-smooth part of the objective function by its Moreau envelope, so that existing algorithms for smooth optimization can be deployed. We establish a convergence guarantee with an iteration complexity of $\mathcal{O}(\epsilon^{-4})$. Numerical experiments conducted under the decentralized setting demonstrate the effectiveness and efficiency of our algorithm.
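To make the smoothing step concrete, the following is a minimal sketch of the Moreau envelope of the entrywise $\ell_1$ norm, $r_\sigma(X) = \min_Y \{\|Y\|_1 + \frac{1}{2\sigma}\|Y - X\|_F^2\}$, together with its gradient $(X - \mathrm{prox}_{\sigma r}(X))/\sigma$. It is an illustration under stated assumptions, not the authors' implementation: the function names and the toy matrix are hypothetical, and matrices are stored as plain lists of rows to avoid external dependencies. The final lines check a standard property of Moreau envelopes of $L$-Lipschitz convex functions, namely $0 \leq r(X) - r_\sigma(X) \leq \sigma L^2 / 2$, where $L = \sqrt{np}$ for the $\ell_1$ norm on $\mathbb{R}^{n \times p}$ with respect to the Frobenius norm.

```python
def soft_threshold(x, sigma):
    """Proximal map of sigma * |.| evaluated at a scalar x."""
    if x > sigma:
        return x - sigma
    if x < -sigma:
        return x + sigma
    return 0.0


def l1_norm(X):
    """Entrywise l1 norm of a matrix stored as a list of rows."""
    return sum(abs(x) for row in X for x in row)


def moreau_envelope_l1(X, sigma):
    """Value and gradient of the Moreau envelope of the l1 norm.

    The minimizer of ||Y||_1 + ||Y - X||_F^2 / (2 * sigma) is the
    entrywise soft-thresholding of X; the gradient of the envelope
    is (X - prox(X)) / sigma.
    """
    P = [[soft_threshold(x, sigma) for x in row] for row in X]
    sq_dist = sum((x - p) ** 2
                  for rx, rp in zip(X, P) for x, p in zip(rx, rp))
    value = l1_norm(P) + sq_dist / (2.0 * sigma)
    grad = [[(x - p) / sigma for x, p in zip(rx, rp)]
            for rx, rp in zip(X, P)]
    return value, grad


# Toy 3 x 2 matrix (hypothetical data, chosen only for illustration).
X = [[0.5, -2.0], [0.0, 1.5], [-0.1, 0.05]]
sigma = 0.2
value, grad = moreau_envelope_l1(X, sigma)

# Smoothing error: nonnegative and at most sigma * L^2 / 2 with L^2 = n * p.
gap = l1_norm(X) - value
bound = sigma * (len(X) * len(X[0])) / 2.0
```

Every entry of the gradient lies in $[-1, 1]$, so the smoothed regularizer has a gradient that is $1/\sigma$-Lipschitz, which is what allows a gradient-based method to be applied to it.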

Paper Structure

This paper contains 16 sections, 5 theorems, 40 equations, 2 figures, and 1 algorithm.

Key Result

Proposition 1

Let $g: \mathbb{R}^{n \times p} \to \mathbb{R}$ be a proper, convex and lower semi-continuous function. Suppose $g$ is Lipschitz continuous with the corresponding Lipschitz constant $L \geq 0$. Then for any $\sigma > 0$, it holds that

Figures (2)

  • Figure 1: Numerical performance of THANOS for different values of $\sigma$ on sparse PCA problems with $r (X) = \left\|X\right\|_1$.
  • Figure 2: Numerical performance of THANOS for different values of $\sigma$ on sparse PCA problems with $r (X) = \left\|X\right\|_{2, 1}$.
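The second regularizer in the figures, $r(X) = \|X\|_{2,1}$, also has a closed-form proximal map, so its Moreau envelope can be evaluated cheaply for any $\sigma$. The sketch below assumes the common row-wise convention $\|X\|_{2,1} = \sum_i \|X_{i,:}\|_2$ (the listing above does not state which convention the paper uses), and the function names and toy data are hypothetical. Under that assumption the proximal map shrinks each row toward zero by $\sigma$ in Euclidean norm.

```python
import math


def row_norm(row):
    """Euclidean norm of one row."""
    return math.sqrt(sum(x * x for x in row))


def l21_norm(X):
    """Sum of row-wise Euclidean norms (assumed row-wise convention)."""
    return sum(row_norm(row) for row in X)


def prox_l21(X, sigma):
    """Row-wise group soft-thresholding: prox of sigma * ||.||_{2,1}."""
    out = []
    for row in X:
        norm = row_norm(row)
        # Rows with norm <= sigma are set to zero; others shrink by sigma.
        scale = max(0.0, 1.0 - sigma / norm) if norm > 0.0 else 0.0
        out.append([scale * x for x in row])
    return out


def moreau_envelope_l21(X, sigma):
    """Value of min_Y ||Y||_{2,1} + ||Y - X||_F^2 / (2 * sigma)."""
    P = prox_l21(X, sigma)
    sq_dist = sum((x - p) ** 2
                  for rx, rp in zip(X, P) for x, p in zip(rx, rp))
    return l21_norm(P) + sq_dist / (2.0 * sigma)


# Toy matrix (hypothetical): one large row, one small row, one zero row.
X = [[3.0, 4.0], [0.03, 0.04], [0.0, 0.0]]
sigma = 0.1
P = prox_l21(X, sigma)
envelope = moreau_envelope_l21(X, sigma)
gap = l21_norm(X) - envelope  # shrinks as sigma decreases
```

The gap between the regularizer and its envelope tightens as $\sigma$ decreases, at the cost of a larger gradient Lipschitz constant $1/\sigma$ for the smoothed term.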

Theorems & Definitions (12)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Proposition 1: Bohm2021variable
  • Proposition 2: Bohm2021variable
  • Lemma 3
  • proof
  • Proposition 4
  • proof
  • ...and 2 more