A Semidefinite Relaxation for Sums of Heterogeneous Quadratic Forms on the Stiefel Manifold

Kyle Gilman; Sam Burer; Laura Balzano

TL;DR

This work introduces a convex SDP relaxation for maximizing sums of heterogeneous quadratic forms on the Stiefel manifold and a practical dual certificate to verify global optimality of a candidate solution. It shows that the relaxation is tight in the close-to jointly diagonalizable regime and proves that heteroscedastic probabilistic PCA (HPPCA) data satisfy this property under realistic data regimes. The dual certificate reduces the verification to a low-dimensional LMI, enabling efficient global optimality checks alongside first-order nonconvex solvers, with empirical results validating both the certificate and tightness in practice. The findings provide rigorous, scalable guarantees for nonconvex subspace learning problems and highlight the HPPCA setting as a notable instance where the SDP is tight with high probability as data grow or noise homogenizes.

Abstract

We study the maximization of sums of heterogeneous quadratic forms over the Stiefel manifold, a nonconvex problem that arises in several modern signal processing and machine learning applications such as heteroscedastic probabilistic principal component analysis (HPPCA). In this work, we derive a novel semidefinite program (SDP) relaxation of the original problem and study a few of its theoretical properties. We prove a global optimality certificate for the original nonconvex problem via a dual certificate, which leads to a simple feasibility problem to certify global optimality of a candidate solution on the Stiefel manifold. In addition, our relaxation reduces to an assignment linear program for jointly diagonalizable problems and is therefore known to be tight in that case. We generalize this result to show that it is also tight for close-to jointly diagonalizable problems, and we show that the HPPCA problem has this characteristic. Numerical results validate our global optimality certificate and sufficient conditions for when the SDP is tight in various problem settings.
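The jointly diagonalizable case mentioned above can be illustrated with a small sketch. It assumes the objective has the form $\sum_{i=1}^{k} u_i^\top M_i u_i$ with orthonormal columns $u_i$, which is one reading of "sums of heterogeneous quadratic forms over the Stiefel manifold" and is not spelled out in this summary. If every $M_i$ is diagonal in a shared basis, the problem reduces to assigning each $M_i$ a distinct basis vector. The function name `best_assignment` and the sample data are hypothetical, and brute-force enumeration stands in for the assignment linear program only to keep the example dependency-free.

```python
from itertools import permutations


def best_assignment(diagonals):
    """Brute-force assignment for the (assumed) jointly diagonal case.

    diagonals[i][j] is the j-th diagonal entry of M_i in the shared
    eigenbasis (k rows, d columns, k <= d). Returns the best objective
    value and the tuple of distinct basis indices, one per M_i.
    """
    k = len(diagonals)
    d = len(diagonals[0])
    best_value, best_cols = float("-inf"), None
    # Enumerate every way to give the k matrices distinct basis vectors.
    for cols in permutations(range(d), k):
        value = sum(diagonals[i][cols[i]] for i in range(k))
        if value > best_value:
            best_value, best_cols = value, cols
    return best_value, best_cols


# Hypothetical data: both matrices prefer basis vector 0, but
# orthogonality lets only one of them have it.
diagonals = [
    [4.0, 3.0, 0.0],  # diagonal of M_1
    [5.0, 1.0, 0.0],  # diagonal of M_2
]
value, cols = best_assignment(diagonals)
print(value, cols)
```

In this toy instance each matrix on its own would pick basis vector 0, so the orthogonality constraint forces a trade-off. That trade-off is what separates the heterogeneous problem from ordinary PCA, where a single matrix determines all columns. Enumeration is only viable for tiny $k$ and $d$; a real implementation would solve the assignment problem with a dedicated solver.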

Paper Structure (41 sections, 35 theorems, 110 equations, 4 figures, 4 tables)

Introduction
Notation
Semidefinite program relaxation
Related work
Theoretical Results
Dual certificate of the SDP
SDP tightness in the close-to jointly diagonalizable (CJD) case
Continuity and tightness in the CJD case
HPPCA possesses the CJD property
Numerical experiments
Assessing the rank-one property (ROP)
Assessing global optimality of local solutions
Synthetic CJD matrices
HPPCA
Computation time
...and 26 more sections

Key Result

Lemma 2.1

If $k < d$, strong duality holds for the SDP relaxation with primal \ref{eq:primal_problem} and dual \ref{eq:dual_problem}.

Figures (4)

Figure 1: Numerical simulations for synthetic CJD matrices for $d=10, k=3$ with increasing $\sigma$ and 100 random problem instances for each setting. As $\sigma$ grows, the max commuting distance grows.
Figure 2: Numerical simulations for $\mathbf{M}_i$ generated by the HPPCA model in \ref{eq:hppca:generative_model} for $d = 50$, $k = 5$, noise variances $\mathbf{v} = [1,4]$, and $\boldsymbol{\lambda} = [4, 3.25, 2.5, 1.75, 1]$ with increasing samples $n$. As $n$ grows, the max commuting distance gets smaller.
Figure 3: Percentages of global certification of StMM solutions out of 100 trials. The fractions not shown are tight instances certified as global.
Figure 4: Computation time of \ref{eq:primal_problem} versus StMM for 2000 iterations with global certificate check \ref{eq:dual_certificate} for HPPCA problems as the data dimension varies. We used $\mathbf{v} = [1,4]$ and $\mathbf{n} = [100,400]$, and made $\boldsymbol{\lambda}$ a $k$-length vector with entries equally spaced in the interval $[1,4]$, where the rank of the model is $k$. Markers indicate the median computation time taken over 10 trials, and error bars show the standard deviation. Due to memory and computation limitations for $d=300$, we only performed one timing test for $k=3$ and $k=5$.

Theorems & Definitions (71)

Lemma 2.1
Definition 2.2: Rank-one property (ROP)
Lemma 2.3
Lemma 2.4
Theorem 4.1
Corollary 4.2
Definition 4.3: Close-to jointly diagonalizable (CJD)
Lemma 4.4
Definition 4.5
Theorem 4.6
...and 61 more
