Non-Negative Stiefel Approximating Flow: Orthogonalish Matrix Optimization for Interpretable Embeddings

Brian B. Avants, Nicholas J. Tustison, James R Stone

TL;DR

NSA-Flow introduces a non-negative matrix approximation framework that softly enforces near-orthogonality via a differentiable flow near the Stiefel manifold, controlled by a single weight parameter $\omega$. By unifying sparse factorization, orthogonalization, and proximal updates, NSA-Flow delivers interpretable, disjoint latent components while maintaining reconstruction fidelity. The approach demonstrates stable convergence, scalable computation, and improved interpretability across simulations and real biomedical datasets, including genomics and neuroimaging, with notable gains in disease subtype prediction and cognitive outcome modeling. Its flexible integration into existing pipelines and a publicly available PyTorch package support broad applicability in high-dimensional data analysis. Practically, NSA-Flow offers a principled mechanism to trade sparsity and decorrelation at the global embedding level, enabling more actionable, biomarker-ready representations in complex data domains.

Abstract

Interpretable representation learning is a central challenge in modern machine learning, particularly in high-dimensional settings such as neuroimaging, genomics, and text analysis. Current methods often struggle to balance the competing demands of interpretability and model flexibility, limiting their effectiveness in extracting meaningful insights from complex data. We introduce Non-negative Stiefel Approximating Flow (NSA-Flow), a general-purpose matrix estimation framework that unifies ideas from sparse matrix factorization, orthogonalization, and constrained manifold learning. NSA-Flow enforces structured sparsity through a continuous balance between reconstruction fidelity and column-wise decorrelation, parameterized by a single tunable weight. The method operates as a smooth flow near the Stiefel manifold with proximal updates for non-negativity and adaptive gradient control, yielding representations that are simultaneously sparse, stable, and interpretable. Unlike classical regularization schemes, NSA-Flow provides an intuitive geometric mechanism for manipulating sparsity at the level of global structure while simplifying latent features. We demonstrate that the NSA-Flow objective can be optimized smoothly and integrates seamlessly with existing pipelines for dimensionality reduction while improving interpretability and generalization in both simulated and real biomedical data. Empirical validation on the Golub leukemia dataset and in Alzheimer's disease demonstrates that the NSA-Flow constraints can maintain or improve performance over related methods with little additional methodological effort. NSA-Flow offers a scalable, general-purpose tool for interpretable ML, applicable across data science domains.
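
To make the fidelity/decorrelation balance described above concrete, here is a minimal NumPy sketch. It is not the authors' algorithm or the API of their PyTorch package. The specific objective, $(1-w)\,\tfrac{1}{2}\lVert Y - X_0\rVert_F^2 + w\,\tfrac{1}{4}\lVert Y^\top Y - I\rVert_F^2$, the fixed step size, and the names `nsa_flow_sketch` and `orth_defect` are all assumptions chosen for illustration. The non-negativity proximal step is taken to be a simple clip at zero.

```python
import numpy as np

def orth_defect(Y):
    """Squared Frobenius distance of Y^T Y from the identity (0 for orthonormal columns)."""
    k = Y.shape[1]
    return float(np.linalg.norm(Y.T @ Y - np.eye(k)) ** 2)

def nsa_flow_sketch(X0, w=0.5, lr=0.05, n_iter=500):
    """Projected gradient descent on an ASSUMED objective (illustration only):
    (1 - w) * 0.5 * ||Y - X0||_F^2 + w * 0.25 * ||Y^T Y - I||_F^2, with Y >= 0.
    """
    Y = np.maximum(X0, 0.0)                      # start from the non-negative part of X0
    k = Y.shape[1]
    for _ in range(n_iter):
        grad_fid = Y - X0                        # gradient of the fidelity term
        grad_orth = Y @ (Y.T @ Y - np.eye(k))    # gradient of the orthogonality term
        Y = Y - lr * ((1.0 - w) * grad_fid + w * grad_orth)
        Y = np.maximum(Y, 0.0)                   # proximal step for non-negativity
    return Y

# Hypothetical toy input: a 4x3 non-negative matrix with slightly overlapping columns.
X0 = np.array([[1.0, 0.1, 0.0],
               [0.9, 0.0, 0.1],
               [0.0, 1.0, 0.2],
               [0.1, 0.0, 1.0]])
Y = nsa_flow_sketch(X0, w=0.9)
print("defect before:", orth_defect(X0))
print("defect after: ", orth_defect(Y))
```

In this sketch, a weight near 0 leaves `Y` close to `X0`, while a weight near 1 prioritizes decorrelated columns. Because the entries stay non-negative, columns can only become orthogonal by having disjoint support, which is one way to read the link between orthogonality and sparsity drawn in the abstract.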

Paper Structure

This paper contains 39 sections, 10 equations, 13 figures, 3 tables.

Figures (13)

  • Figure 1: Illustration of the NSA-Flow optimization as a function of $\omega$. The colored manifold is a conceptual representation of the Stiefel manifold, with the curves representing optimization paths for evolving $Y$. When $\omega$ is small, the retraction is mild, allowing more deviation from orthonormality; when $\omega$ is large, the retraction strongly enforces orthonormality, pulling $Y$ closer to the manifold.
  • Figure 2: NSA-Flow Algorithm Workflow
  • Figure 3: NSA-Flow applied to a toy $4 \times 3$ matrix with noisy orthogonal non-negative patterns.
  • Figure 4: The impact of regularization on measures of both orthogonality and fidelity error across different optimizers. For both metrics, lower values are better. We vary $\omega$ from 0.01 (minor orthogonality enforcement) to 0.99 (near-full enforcement) and compare standard torch optimizers. Smoothly decreasing curves for orthogonality and smoothly increasing curves for fidelity are expected and indicate better performance. This evaluation, as well as comparisons in sparse PCA and factor analysis, suggests that LARS and ASGD are the most reliable in conjunction with the NSA-Flow implementation.
  • Figure 5: The default optimizer's performance across different values of the orthogonality weight indicates that the single parameter provides predictable control of orthogonality in the outcome. Higher values lead predictably to more decorrelation at the expense of fidelity.
  • ...and 8 more figures
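
The captions for Figures 1, 4, and 5 describe the same qualitative behavior: a larger $\omega$ pulls $Y$ closer to the Stiefel manifold at the cost of fidelity. The sketch below illustrates that trade-off with a deliberately simplified stand-in, not the paper's retraction: a linear blend between a matrix and its polar factor $UV^\top$ (the nearest matrix with orthonormal columns). The function names, the toy matrix, and the two metric definitions are assumptions for illustration. Non-negativity is ignored here, since the polar factor may contain negative entries.

```python
import numpy as np

def soft_polar_retraction(Y, omega):
    """Blend Y with its polar factor U V^T; omega=0 returns Y, omega=1 the polar factor."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    s_blend = (1.0 - omega) * s + omega          # singular values move linearly toward 1
    return (U * s_blend) @ Vt                    # scale the columns of U, then recombine

def orthogonality_defect(Y):
    """Squared Frobenius norm of Y^T Y - I (an assumed metric; lower is better)."""
    k = Y.shape[1]
    return float(np.linalg.norm(Y.T @ Y - np.eye(k)) ** 2)

def fidelity_error(Y, X0):
    """Relative Frobenius distance from the input (an assumed metric; lower is better)."""
    return float(np.linalg.norm(Y - X0) / np.linalg.norm(X0))

# Hypothetical toy input with correlated, non-unit-norm columns.
X0 = np.array([[1.0, 0.1, 0.0],
               [0.9, 0.0, 0.1],
               [0.0, 1.0, 0.2],
               [0.1, 0.0, 1.0]])

omegas = (0.01, 0.25, 0.5, 0.75, 0.99)
defects, errors = [], []
for omega in omegas:
    Y = soft_polar_retraction(X0, omega)
    defects.append(orthogonality_defect(Y))
    errors.append(fidelity_error(Y, X0))
    print(f"omega={omega:.2f}  orth_defect={defects[-1]:.4f}  fidelity_error={errors[-1]:.4f}")
```

For this blend, each singular value moves monotonically toward 1, so the orthogonality defect falls and the fidelity error rises as $\omega$ increases. That matches the curve shapes the captions describe as expected, although the actual NSA-Flow curves depend on the optimizer and the non-negativity constraint.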