Prior distributions for structured semi-orthogonal matrices

Michael Jauch, Marie-Christine Düker, Peter Hoff

TL;DR

This work develops a Bayesian framework for priors on structured semi-orthogonal matrices $Q \in \mathcal{V}(k,p)$ by projecting an unconstrained matrix $X$ onto the Stiefel manifold, with the matrix angular central Gaussian (MACG) prior as a core component. It establishes invariance and Wasserstein-approximation results showing that structure embedded in $X$ transfers to $Q$, so that sparsity or smoothness can be encoded explicitly through choices of $Z$ and a correlation matrix $\Omega$. The authors illustrate two applications, a sparse network eigenmodel for protein interactions and a smooth PCA for ocean oxygen data, demonstrating improved interpretability with competitive predictive performance. Posterior inference is carried out via parameter-expanded MCMC using polar expansion, aided by a scale-mixture representation of the shrinkage prior and the MACG framework. This approach provides a principled, tractable way to incorporate domain-specific structure into semi-orthogonal matrix parameters across multivariate models.

Abstract

Statistical models for multivariate data often include a semi-orthogonal matrix parameter. In many applications, there is reason to expect that the semi-orthogonal matrix parameter satisfies a structural assumption such as sparsity or smoothness. From a Bayesian perspective, these structural assumptions should be incorporated into an analysis through the prior distribution. In this work, we introduce a general approach to constructing prior distributions for structured semi-orthogonal matrices that leads to tractable posterior inference via parameter-expanded Markov chain Monte Carlo. We draw on recent results from random matrix theory to establish a theoretical basis for the proposed approach. We then introduce specific prior distributions for incorporating sparsity or smoothness and illustrate their use through applications to biological and oceanographic data.

Paper Structure (15 sections, 5 theorems, 19 equations, 5 figures)

This paper contains 15 sections, 5 theorems, 19 equations, 5 figures.

Key Result

Theorem 3.1

If the random matrix $X$ is invariant to left multiplication by elements of $\mathcal{L}$ and right multiplication by elements of $\mathcal{R},$ then so is $Q_{X}$.
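
The mechanism behind this invariance can be checked numerically. The sketch below assumes that $Q_X$ denotes the semi-orthogonal polar factor $X(X^TX)^{-1/2}$ of a full-rank $p \times k$ matrix $X$ (consistent with the "polar expansion" mentioned in the TL;DR, but not stated explicitly in this summary), and that $\mathcal{L}$ and $\mathcal{R}$ consist of orthogonal matrices. Under those assumptions, $Q_{LXR} = L\,Q_X\,R$, so any distributional invariance of $X$ carries over to $Q_X$. The function name `polar_projection` and the particular choices of `L` and `R` are illustrative only.

```python
import numpy as np

def polar_projection(X):
    """Semi-orthogonal polar factor of a full-rank p x k matrix X (p >= k).

    Computed from the thin SVD X = U diag(s) V^T as U V^T, which equals
    X (X^T X)^{-1/2}. Assumed here to be what the summary calls Q_X.
    """
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ Vt

rng = np.random.default_rng(0)
p, k = 6, 3
X = rng.standard_normal((p, k))
Q = polar_projection(X)

# Illustrative left action: a random signed permutation (p x p, orthogonal).
perm = rng.permutation(p)
signs = rng.choice([-1.0, 1.0], size=p)
L = np.eye(p)[perm] * signs[:, None]

# Illustrative right action: a random k x k orthogonal matrix from a QR factorization.
R, _ = np.linalg.qr(rng.standard_normal((k, k)))

Q_transformed = polar_projection(L @ X @ R)

# Q has orthonormal columns, and projecting L X R gives L Q R.
orthonormal_ok = np.allclose(Q.T @ Q, np.eye(k))
equivariant_ok = np.allclose(Q_transformed, L @ Q @ R)
print(orthonormal_ok, equivariant_ok)
```

The SVD route is used here instead of an explicit inverse matrix square root because it avoids forming $X^TX$; both give the same factor when $X$ has full column rank.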

Figures (5)

  • Figure 1: A comparison of the marginal densities of $z_i$ and $\sqrt{p}\,q_i$ for $\ell=.1$ when $p=5$ and $p=100,$ as discussed in Example \ref{ex:marginal}.
  • Figure 2: A comparison of the columns of a single realization $X^*$ of $X$ with those of $\sqrt{p}\,Q_{X^*}$ for the sparsity-inducing prior with $\ell = .1$ and dimensions $p=100, k=3$ as described in Example \ref{ex:sparse_figure}. The entries of $X^*$ appear as gray dots while the entries of $\sqrt{p}\,Q_{X^*}$ appear as black circles.
  • Figure 3: A comparison of the columns of a single realization $X^*$ of $X$ with those of $\sqrt{p}\,Q_{X^*}$ when the entries of $Z$ are i.i.d. standard normals and $\Omega$ is constructed from the Matérn correlation function, as discussed in Example \ref{ex:matern}. The entries of $X^*$ appear as gray dots while the entries of $\sqrt{p}\,Q_{X^*}$ appear as black circles.
  • Figure 4: Panel (a) shows the marginal posterior distribution of the sparsity parameter $\ell \in (0,1).$ A smaller value of $\ell$ leads to greater sparsity. Panel (b) compares the entries of the point estimate of $Q \Lambda Q^T$ under the two priors for $Q.$
  • Figure 5: The left side compares our point estimate (in black) to the results of classical PCA (in gray). The right side compares a histogram estimate of the posterior density of $\rho$ with its inverse gamma prior density.
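
The kind of comparison shown in Figure 3 can be sketched in a few lines. Everything specific here is an assumption made for illustration, not the paper's stated construction: $X$ is formed as $CZ$ with $CC^T = \Omega$, the Matérn smoothness is fixed at $3/2$, the range parameter is called `rho`, the $p$ row locations lie on an evenly spaced grid in $[0,1]$, and $Q_X$ is taken to be the polar factor of $X$. The helper names are hypothetical.

```python
import numpy as np

def matern32_correlation(locations, rho):
    """Matern correlation matrix with smoothness 3/2 and range rho (assumed form)."""
    d = np.abs(locations[:, None] - locations[None, :])
    scaled = np.sqrt(3.0) * d / rho
    return (1.0 + scaled) * np.exp(-scaled)

def polar_projection(X):
    """Semi-orthogonal polar factor U V^T from the thin SVD of X."""
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ Vt

def roughness(M):
    """Mean squared difference between vertically adjacent entries."""
    return float(np.mean(np.diff(M, axis=0) ** 2))

rng = np.random.default_rng(1)
p, k, rho = 100, 3, 0.2                      # illustrative values
locations = np.linspace(0.0, 1.0, p)         # hypothetical 1-D grid
Omega = matern32_correlation(locations, rho)

# Cholesky factor C with C C^T ~= Omega; the small jitter is for numerical stability.
C = np.linalg.cholesky(Omega + 1e-8 * np.eye(p))

Z = rng.standard_normal((p, k))              # i.i.d. standard normal entries
X_smooth = C @ Z                             # columns correlated along the rows

Q_smooth = np.sqrt(p) * polar_projection(X_smooth)
Q_rough = np.sqrt(p) * polar_projection(Z)   # baseline without Omega

print(roughness(X_smooth), roughness(Q_smooth), roughness(Q_rough))
```

Under these assumptions, the columns of $\sqrt{p}\,Q_X$ are linear combinations of the smooth columns of $X$, so the projected matrix stays far smoother than the baseline built from $Z$ alone.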

Theorems & Definitions (13)

  • Example 1: Model-based singular value decomposition
  • Example 2: Network eigenmodel
  • Theorem 3.1: Invariance
  • Definition 3.2
  • Theorem 3.3: Wasserstein distance
  • Proposition 1
  • Proposition 2
  • Example 3
  • Example 4
  • Example 5
  • ...and 3 more