Stratified Non-Negative Tensor Factorization
Alexander Sietsema, Zerrin Vural, James Chapman, Yotam Yaniv, Deanna Needell
TL;DR
This work extends Stratified-NMF to the tensor setting by introducing Stratified-NTF, a non-negative tensor factorization framework that preserves multi-mode geometry while jointly learning global topics shared across strata and strata-dependent shifts. It derives multiplicative update rules for the strata and global factors and adds a total variation (TV) regularized variant for image denoising. Because the data are not flattened, learning is efficient and can require less memory than flattened approaches. Experiments on text and image datasets show interpretable, strata-aware topics, rapid convergence, and denoising benefits, with advantages over flattening-based methods and competitive performance using fewer parameters. The approach offers a principled way to analyze heterogeneous, multi-modal data while preserving structure and interpretability, and the code is publicly available.
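As a rough illustration of the memory claim, the sketch below counts the parameters of a flattened NMF model against those of a tensor factorization. It assumes a CP-style model with one factor matrix per mode, which the summary above does not specify, and it ignores the strata-dependent terms entirely. The function names and the example shape are hypothetical.

```python
from math import prod


def flattened_nmf_params(shape, rank):
    """Parameters in A_flat ~ W @ H after unfolding every mode but the first."""
    rows, cols = shape[0], prod(shape[1:])
    return rank * (rows + cols)


def cp_ntf_params(shape, rank):
    """Parameters in a CP-style model (assumed here): one factor matrix per mode."""
    return rank * sum(shape)


# Hypothetical example: 100 samples, each a 64 x 64 image, factorization rank 10.
shape = (100, 64, 64)
rank = 10

print(flattened_nmf_params(shape, rank))  # 10 * (100 + 4096) = 41960
print(cp_ntf_params(shape, rank))         # 10 * (100 + 64 + 64) = 2280
```

Under these assumptions the flattened model grows with the product of the non-sample mode sizes, while the tensor model grows with their sum.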
Abstract
Non-negative matrix factorization (NMF) and non-negative tensor factorization (NTF) decompose non-negative high-dimensional data into non-negative low-rank components. NMF and NTF methods are popular for their intrinsic interpretability and effectiveness on large-scale data. Recent work developed Stratified-NMF, which applies NMF to regimes where data may come from different sources (strata) with different underlying distributions, and seeks to recover both strata-dependent information and global topics shared across strata. Applying Stratified-NMF to multi-modal data requires flattening across modes, and therefore loses geometric structure contained implicitly within the tensor. To address this problem, we extend Stratified-NMF to the tensor setting by developing a multiplicative update rule and demonstrating the method on text and image data. We find that Stratified-NTF can identify interpretable topics with lower memory requirements than Stratified-NMF. We also introduce a regularized version of the method and demonstrate its effects on image data.
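For readers unfamiliar with multiplicative updates, the following is a minimal sketch of the classical Lee-Seung rule for plain NMF under a Frobenius loss. It is not the Stratified-NTF update developed in this work, which additionally handles strata-dependent terms and tensor modes. The function name, iteration count, and the small constant `eps` are illustrative assumptions.

```python
import numpy as np


def nmf_multiplicative(A, rank, n_iter=200, eps=1e-10, seed=0):
    """Classical multiplicative updates for A ~ W @ H (plain NMF, not Stratified-NTF)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, rank))  # non-negative random initialization
    H = rng.random((rank, n))
    for _ in range(n_iter):
        # Each factor is rescaled by a ratio of non-negative terms,
        # so non-negativity is preserved without any projection step.
        H *= (W.T @ A) / (W.T @ W @ H + eps)
        W *= (A @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic non-negative data with an exact rank-3 factorization.
rng = np.random.default_rng(1)
A = rng.random((20, 3)) @ rng.random((3, 15))

W, H = nmf_multiplicative(A, rank=3)
rel_err = np.linalg.norm(A - W @ H) / np.linalg.norm(A)
print(f"relative reconstruction error: {rel_err:.4f}")
```

The appeal of this family of updates is that every step is an elementwise rescaling by non-negative quantities, so the factors stay non-negative by construction.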
