Constrained non-negative matrix factorization enabling real-time insights of $\textit{in situ}$ and high-throughput experiments
Phillip M. Maffettone, Aidan C. Daly, Daniel Olds
TL;DR
The paper addresses the challenge of real-time interpretation of streaming diffraction data where canonical NMF can yield nonphysical components. It introduces constrained non-negative matrix factorization with user or algorithmic priors, solved via alternating non-negative least squares and implemented in PyTorch, to produce physically meaningful weights and components during in situ analyses. Demonstrations on synthetic datasets and on variable-temperature $BaTiO_3$ and molten-salt $NaCl:CrCl_3$ data show that constraints yield interpretable phase evolution and enable adaptive experimental decisions. This approach enables rapid, human-in-the-loop insights at beamlines and is extensible to other spectral modalities, providing a practical path toward real-time discovery in high-throughput experiments.
Abstract
Non-negative Matrix Factorization (NMF) offers an appealing unsupervised learning approach for real-time analysis of streaming spectral data in time-sensitive data collection, such as $\textit{in situ}$ characterization of materials. However, canonical NMF methods are optimized to reconstruct a full dataset as closely as possible, with no underlying requirement that the reconstruction produce components or weights representative of the true physical processes. In this work, we demonstrate how constraining NMF weights or components, provided as known or assumed priors, can significantly improve the recovery of the true underlying phenomena. We present a PyTorch-based method for efficiently applying constrained NMF and demonstrate it on several synthetic examples. When the method is applied to streaming, experimentally measured spectral data, an expert researcher-in-the-loop can provide and dynamically adjust the constraints. This set of interactive priors to the NMF model can, for example, contain known or identified independent components, as well as functional expectations about the mixing of components. We demonstrate this application on measured X-ray diffraction and pair distribution function data from $\textit{in situ}$ beamline experiments. Details of the method are described, and general guidance is provided for employing constrained NMF to extract critical information and insights during $\textit{in situ}$ and high-throughput experiments.
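To illustrate what constraining components means in practice, the following is a minimal sketch rather than the authors' implementation. It assumes NumPy instead of PyTorch and uses standard multiplicative updates in place of the alternating non-negative least squares solver named in the summary. The function name `constrained_nmf`, the `fixed_components` argument, and the two-peak synthetic dataset are hypothetical and chosen only for illustration.

```python
import numpy as np


def constrained_nmf(X, n_components, fixed_components=None,
                    n_iter=500, eps=1e-10, seed=0):
    """Factor X (n_samples, n_features) as W @ H with non-negative W and H.

    fixed_components (hypothetical argument): dict mapping a row index of H
    to a known 1-D component that is held constant during optimization.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    W = rng.random((n_samples, n_components))   # weights (mixing fractions)
    H = rng.random((n_components, n_features))  # components (patterns)

    # Impose the prior: overwrite constrained rows and mark them as frozen.
    free = np.ones(n_components, dtype=bool)
    for idx, component in (fixed_components or {}).items():
        H[idx] = component
        free[idx] = False

    for _ in range(n_iter):
        # Multiplicative update for the weights (keeps W non-negative).
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        # Multiplicative update for the components, applied to free rows only.
        H_new = H * (W.T @ X) / (W.T @ W @ H + eps)
        H[free] = H_new[free]
    return W, H


# Synthetic example: two Gaussian "phases" whose fractions vary linearly.
x = np.linspace(0, 10, 200)
comp_a = np.exp(-(x - 3.0) ** 2)
comp_b = np.exp(-(x - 7.0) ** 2)
frac = np.linspace(0, 1, 30)[:, None]
X = (1 - frac) * comp_a + frac * comp_b

# Both end members are supplied as known priors; only the weights are learned.
W, H = constrained_nmf(X, 2, fixed_components={0: comp_a, 1: comp_b})
```

With both components fixed, each row of `W` reduces to a non-negative least-squares fit against known patterns, so the second column of `W` should track the phase fraction `frac`. Leaving one entry out of `fixed_components` lets that component be learned alongside the weights, which is closer to the case where only some phases are known in advance.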
