EB-gMCR: Energy-Based Generative Modeling for Signal Unmixing and Multivariate Curve Resolution

Yu-Tang Chang, Shih-Fang Chen

TL;DR

EB-gMCR reframes multivariate curve resolution as a generative unmixing problem and solves the inverse problem with an energy-based deep solver. The approach automatically discovers the minimal active component set from large candidate pools via an explicit selection gate, while accommodating domain priors as plug-ins. It demonstrates high reconstruction fidelity with few active components on synthetic data up to N=256 and achieves competitive decomposability on real spectral datasets (Carbs, NIR) compared to MF-based baselines. The method is designed for scalability, reusability across samples from the same generative process, and flexible integration of chemical constraints, offering a practical framework for high-throughput, fixed-pattern signal unmixing.
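
As a hedged illustration of the generative view, one way to write a single mixed signal is given below. The notation is an assumption chosen to match the $\mathbf{D}$ and $\delta$ of Figure 1, and plain linear mixing is assumed; the paper's own formulation may differ.

$$\mathbf{d}_i = \sum_{k=1}^{K} \delta_{ik}\, c_{ik}\, \mathbf{s}_k + \boldsymbol{\varepsilon}_i, \qquad \delta_{ik} \in \{0, 1\}$$

Here $\mathbf{d}_i$ is the $i$-th row of $\mathbf{D}$, $\mathbf{s}_k$ is the $k$-th pattern in a candidate pool of size $K$, $c_{ik}$ is its concentration in sample $i$, $\delta_{ik}$ is the selection gate, and $\boldsymbol{\varepsilon}_i$ is noise. A component with $\delta_{ik} = 0$ for every $i$ is inactive, so the active component count is the number of indices $k$ with at least one nonzero gate.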

Abstract

Signal unmixing analysis decomposes data into basic patterns and is widely applied in chemical and biological research. Multivariate curve resolution (MCR), a branch of signal unmixing, separates mixed signals into components (base patterns) and their concentrations (intensities), playing a key role in understanding composition. Classical MCR is typically framed as matrix factorization (MF) and requires a user-specified number of components, which is usually unknown in real data. As the data volume or the number of components grows, these MCR approaches face significant scalability challenges. This study reformulates MCR as a data generative process (gMCR) and introduces an energy-based solver, EB-gMCR, that automatically discovers the smallest component set, together with the corresponding concentrations, that reconstructs the mixed signals faithfully. On synthetic benchmarks with up to 256 components, EB-gMCR attains high reconstruction fidelity and recovers the component count to within 5% at 20 dB noise and near-exactly at 30 dB. On two public spectral datasets, it identifies the correct component count and improves component separation over MF-based MCR approaches (NMF variants, ICA, MCR-ALS). EB-gMCR is a general solver for fixed-pattern signal unmixing (components remain invariant across mixtures). Domain priors (non-negativity, nonlinear mixing) enter as plug-in modules, enabling adaptation to new instruments or domains without altering the core selection learning step. The source code is available at https://github.com/b05611038/ebgmcr_solver.
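
The fixed-pattern setting and the dependence of MF-style reconstruction on a user-chosen component count can be sketched in a few lines of NumPy. This is not the EB-gMCR solver and does not use the linked repository: the function names, the Gaussian-peak components, the 0.5 gate probability, and the use of truncated SVD as an unconstrained stand-in for matrix factorization are all assumptions made for illustration.

```python
import numpy as np


def simulate_mixtures(n_samples=200, n_components=4, n_channels=64,
                      snr_db=20.0, seed=0):
    """Toy fixed-pattern mixtures D = C @ S + noise (illustrative assumption)."""
    rng = np.random.default_rng(seed)
    # Fixed, non-negative component patterns: one Gaussian peak per component.
    axis = np.linspace(0.0, 1.0, n_channels)
    centers = rng.uniform(0.1, 0.9, size=n_components)
    S = np.exp(-0.5 * ((axis[None, :] - centers[:, None]) / 0.05) ** 2)
    # Binary selection gate: each sample uses a random subset of components.
    gate = rng.random((n_samples, n_components)) < 0.5
    C = gate * rng.uniform(0.5, 2.0, size=(n_samples, n_components))
    clean = C @ S
    # Additive Gaussian noise scaled to the requested signal-to-noise ratio.
    noise_power = np.mean(clean ** 2) / (10.0 ** (snr_db / 10.0))
    D = clean + rng.normal(0.0, np.sqrt(noise_power), size=clean.shape)
    return D, C, S


def r2_score(D, D_hat):
    """Coefficient of determination over all entries of D."""
    ss_res = np.sum((D - D_hat) ** 2)
    ss_tot = np.sum((D - D.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def rank_k_reconstruction(D, k):
    """Best rank-k approximation of D via truncated SVD (no constraints)."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]


if __name__ == "__main__":
    D, C, S = simulate_mixtures()
    for k in range(1, 9):
        print(k, round(r2_score(D, rank_k_reconstruction(D, k)), 4))
```

Run as a script, this prints $R^2$ for each candidate component count. In a factorization of this kind the count is an input that the analyst has to sweep and inspect, whereas the abstract describes EB-gMCR as inferring it through the selection gate.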

Paper Structure

This paper contains 44 sections, 6 theorems, 48 equations, 3 figures, 5 tables, 3 algorithms.

Key Result

Corollary B.1

Assume B1–B5. Let $[t_1, t_2]$ be a window with $\lambda_t \geq \lambda^{\ast}$ and $\tau_t \leq \bar{\tau}$ for all $t \in [t_1, t_2]$, and suppose … Then, with probability at least $1 - \alpha$, there exists $t^{\ast} \in [t_1, t_2]$ such that $S_t = S^{\ast}$ for all $t \geq t^{\ast}$. We call $t < t^{\ast}$ Phase A (support selection) and $t \geq t^{\ast}$ Phase B (optimization on the fixed …

Figures (3)

  • Figure 1: Overview. (a) gMCR graphical model. Only the black node $\mathbf{D}$ (data) is observed. (b) EB-select. An energy-based adaptive gate that infers the component selection ($\delta$).
  • Figure 2: Synthetic benchmarks. (a–d) EB-gMCR checkpoints: estimated vs. true component number (dashed black line); mean (solid) and ±1 SD (shaded) over 5 replicates; colors denote $R^2$ checkpoint bands. Panels: (a) $4N$, 20 dB; (b) $4N$, 30 dB; (c) $8N$, 20 dB; (d) $8N$, 30 dB. (e, g) EB-gMCR vs. baselines: estimated vs. true components at $4N$ under 20 dB and 30 dB. (f, h) Reconstruction $R^2$ at each method's EC for the same settings.
  • Figure 3: Real-data reconstruction. $R^2$ vs component number: (a) Carbs ($N=3$), (b) NIR ($N=2$).

Theorems & Definitions (12)

  • Corollary B.1: Phase separation under an exact-penalty window
  • proof : Proof sketch
  • Lemma B.2: Uniform concentration of energy gaps over a finite window; uses B1–B3
  • proof : Proof
  • Lemma B.3: Robustness to the statewise quadratic; uses B2–B3
  • proof : Proof sketch
  • Lemma B.4: Gating stabilization horizon in Phase A; uses B1–B4
  • proof : Proof sketch
  • Lemma B.5: PL descent under bounded perturbations in Phase B; uses B6–B7
  • proof : Proof sketch
  • ...and 2 more