LaFA: Latent Feature Attacks on Non-negative Matrix Factorization
Minh Vu, Ben Nebgen, Erik Skau, Geigh Zollicoffer, Juan Castorena, Kim Rasmussen, Boian Alexandrov, Manish Bhattarai
TL;DR
The paper targets the robustness of non-negative matrix factorization by introducing Latent Feature Attacks (LaFA) that optimize perturbations to maximize discrepancies in latent features, measured by a novel Feature Error (FE) loss defined on aligned latent representations. It presents two gradient-based attack strategies: Back-propagation, which unrolls NMF updates, and Implicit Differentiation, which uses the fixed-point condition to compute gradients with reduced memory overhead. FE loss relies on permutation alignment via the Hungarian algorithm to compare the latent feature matrices, enabling targeted attacks on the factors rather than the reconstruction error. Across synthetic data and four real datasets (WTSI, Face, Swimmer, MNIST), LaFA demonstrates substantial distortions in latent features with small perturbations, revealing vulnerabilities not captured by reconstruction metrics, with the implicit method offering a scalable, memory-efficient attack route. This work advances understanding of unsupervised model robustness and suggests pathways to fortify NMF against latent-feature manipulation in practical settings.
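The permutation-aligned comparison described above can be illustrated with a short sketch. The summary does not give the exact form of the FE loss, so the version below is an assumption: columns are normalized to remove NMF's scale ambiguity, matched with SciPy's `linear_sum_assignment` (a Hungarian-style solver), and compared by relative Frobenius error. The function name `feature_error` and the `eps` parameter are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment


def feature_error(W_ref, W_adv, eps=1e-12):
    """Relative Frobenius error between two feature matrices after
    column normalization and optimal column matching.

    Assumed form of a feature-error metric, not the paper's definition.
    Both inputs have shape (n_features, k).
    """
    # Normalize each column so that rescaled factors compare as equal.
    A = W_ref / (np.linalg.norm(W_ref, axis=0, keepdims=True) + eps)
    B = W_adv / (np.linalg.norm(W_adv, axis=0, keepdims=True) + eps)

    # cost[i, j] = squared distance between column i of A and column j of B.
    cost = ((A[:, :, None] - B[:, None, :]) ** 2).sum(axis=0)

    # Minimum-cost one-to-one matching of columns.
    _, cols = linear_sum_assignment(cost)

    # Reorder B so that its column i is the match for column i of A.
    B_aligned = B[:, cols]
    return np.linalg.norm(A - B_aligned) / np.linalg.norm(A), cols
```

With this definition, a feature matrix whose columns are merely permuted and rescaled yields an error close to zero, while a genuinely distorted matrix yields a strictly positive error. That invariance is what lets the metric target the factors themselves rather than their ordering or scale.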
Abstract
As Machine Learning (ML) applications rapidly grow, concerns about adversarial attacks compromising their reliability have gained significant attention. One unsupervised ML method known for its resilience to such attacks is Non-negative Matrix Factorization (NMF), an algorithm that decomposes input data into lower-dimensional latent features. However, the introduction of powerful computational tools such as PyTorch enables the computation of gradients of the latent features with respect to the original data, raising concerns about NMF's reliability. Interestingly, naively deriving the adversarial loss for NMF as is done in other ML settings would result in the reconstruction loss, which can be shown theoretically to be an ineffective attacking objective. In this work, we introduce a novel class of attacks on NMF termed Latent Feature Attacks (LaFA), which aim to manipulate the latent features produced by the NMF process. Our method applies the Feature Error (FE) loss directly to the latent features. By employing the FE loss, we generate perturbations in the original data that significantly affect the extracted latent features, revealing vulnerabilities akin to those found in other ML techniques. To handle the large peak-memory overhead of gradient back-propagation in FE attacks, we develop a method based on implicit differentiation, which enables the attacks to scale to larger datasets. We validate the vulnerabilities of NMF and the effectiveness of FE attacks through extensive experiments on synthetic and real-world data.
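The back-propagation route mentioned in the abstract can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the standard multiplicative-update rules for Frobenius-norm NMF, a plain squared difference between clean and perturbed features in place of the FE loss (both runs share the same initialization, so no permutation alignment is applied), and a signed-gradient ascent step under an L-infinity budget. The names `nmf_unrolled` and `latent_attack` and the parameters `budget`, `step`, and `n_steps` are hypothetical.

```python
import torch


def nmf_unrolled(X, W0, H0, n_iter=30, eps=1e-9):
    """Run multiplicative NMF updates as differentiable tensor operations,
    so autograd can trace gradients from W back to X."""
    W, H = W0, H0
    for _ in range(n_iter):
        H = H * (W.T @ X) / (W.T @ W @ H + eps)
        W = W * (X @ H.T) / (W @ (H @ H.T) + eps)
    return W, H


def latent_attack(X, W0, H0, budget=0.05, step=0.01, n_steps=5):
    """Search for a bounded perturbation of X that moves the learned W.

    Stand-in objective (assumption): squared difference between the clean
    and perturbed feature matrices.
    """
    with torch.no_grad():
        W_clean, _ = nmf_unrolled(X, W0, H0)

    # Random start inside the budget; at delta = 0 the gradient is zero.
    delta = (torch.rand_like(X) * 2 - 1) * budget
    for _ in range(n_steps):
        delta.requires_grad_(True)
        # Clamp so the perturbed data stays non-negative.
        X_adv = torch.clamp(X + delta, min=0.0)
        W_adv, _ = nmf_unrolled(X_adv, W0, H0)
        loss = ((W_adv - W_clean) ** 2).sum()
        (grad,) = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            # Ascent step, then projection back onto the L-infinity ball.
            delta = (delta + step * grad.sign()).clamp(-budget, budget)

    with torch.no_grad():
        W_adv, _ = nmf_unrolled(torch.clamp(X + delta, min=0.0), W0, H0)
        final_loss = ((W_adv - W_clean) ** 2).sum().item()
    return delta, final_loss
```

Because every one of the `n_iter` updates is kept in the autograd graph, peak memory grows with the number of unrolled iterations. That growth is the overhead the implicit-differentiation variant is meant to avoid.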
