SlotFlow: Amortized Trans-Dimensional Inference with Slot-Based Normalizing Flows
Niklas Houba, Giovanni Giarda, Lorenzo Speri
TL;DR
SlotFlow tackles trans-dimensional Bayesian inference, where the number of components $K$ is unknown. It uses a dual-stream encoder to fuse time and frequency information, a dynamic slot allocator that instantiates exactly $\hat{K}$ slots, and a shared conditional normalizing flow trained with Hungarian matching to produce per-slot posteriors, achieving $O(K)$ computation and millisecond latency. On crowded sinusoidal mixtures, the approach reaches 99.85% cardinality accuracy and well-calibrated parameter posteriors. Its amplitude and phase posteriors closely match those of RJMCMC, while its frequency posteriors are two to three times broader, consistent with an encoder bottleneck. SlotFlow is $\sim 10^6\times$ faster than RJMCMC, which could enable real-time or interactive analysis in gravitational-wave astronomy, neural spike sorting, and object-centric vision. Limitations include the factorized posterior approximation and the encoder-induced loss of frequency precision; future work targets multi-scale encoders, time–frequency representations, and explicit inter-slot dependencies to improve frequency resolution without sacrificing efficiency.
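The matching step can be illustrated with a minimal sketch. It assumes a cost matrix `cost[i][j]` for assigning slot `i` to ground-truth component `j`; the squared frequency error used here is a hypothetical stand-in, since the summary does not state the actual matching cost. For brevity the sketch searches all permutations, which returns the same optimal assignment as the Hungarian algorithm but scales as $O(K!)$; in practice a routine such as `scipy.optimize.linear_sum_assignment` would be used instead.

```python
from itertools import permutations


def match_slots(cost):
    """Return (best_perm, best_total) minimizing sum_i cost[i][perm[i]].

    cost[i][j] is the cost of assigning slot i to ground-truth component j.
    Brute-force search: exact, but only practical for small K.
    """
    k = len(cost)
    best_perm, best_total = None, float("inf")
    for perm in permutations(range(k)):
        total = sum(cost[i][perm[i]] for i in range(k))
        if total < best_total:
            best_perm, best_total = perm, total
    return best_perm, best_total


# Hypothetical example: per-slot frequency estimates vs. true frequencies.
slot_freqs = [0.31, 0.10, 0.52]
true_freqs = [0.10, 0.30, 0.50]

# Stand-in cost (an assumption): squared error between estimate and truth.
cost = [[(s - t) ** 2 for t in true_freqs] for s in slot_freqs]

perm, total = match_slots(cost)
# perm[i] is the index of the true component matched to slot i.
```

Because the minimum is taken over all assignments, the matched total does not depend on the order in which the ground-truth components are listed, which is what makes a loss built on it permutation-invariant.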
Abstract
Inferring the number of distinct components contributing to an observation, while simultaneously estimating their parameters, remains a long-standing challenge across signal processing, astrophysics, and neuroscience. Classical trans-dimensional Bayesian methods such as Reversible Jump Markov Chain Monte Carlo (RJMCMC) provide asymptotically exact inference but can be computationally expensive. Modern deep learning methods offer a faster alternative for inference but typically assume fixed component counts, sidestepping the core challenge of trans-dimensionality. To address this, we introduce SlotFlow, a deep learning architecture for trans-dimensional amortized inference. The architecture processes time-series observations, which we represent jointly in the frequency and time domains through parallel encoders. A classifier produces a distribution over component counts $K$, and its MAP estimate specifies the number of slots instantiated. Each slot is parameterized by a shared conditional normalizing flow trained via permutation-invariant Hungarian matching. On sinusoidal decomposition with up to 10 overlapping components and Gaussian noise, SlotFlow achieves 99.85% cardinality accuracy and well-calibrated parameter posteriors, with systematic biases well below one posterior standard deviation. Direct comparison with RJMCMC shows close agreement in amplitude and phase, with Wasserstein distances $W_2 < 0.01$ and $W_2 < 0.03$, respectively, indicating that shared global context captures inter-component structure despite a factorized posterior. Frequency posteriors remain centered but exhibit $2$–$3\times$ broader intervals, consistent with an encoder bottleneck in retaining long-baseline phase coherence. The method delivers a $\sim 10^6\times$ speedup over RJMCMC, suggesting applicability to time-critical workflows in gravitational-wave astronomy, neural spike sorting, and object-centric vision.
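For readers unfamiliar with the $W_2$ metric quoted above, the following is a minimal sketch of the empirical 2-Wasserstein distance between two one-dimensional sample sets. It assumes equal sample sizes, in which case the optimal transport plan pairs sorted samples. The abstract does not specify how the distances were computed or normalized, so the function name `w2_1d` and the example values are illustrative only, and the sketch ignores the circular wrapping that a phase parameter would require.

```python
import math


def w2_1d(xs, ys):
    """Empirical 2-Wasserstein distance between two equal-size 1D samples.

    In one dimension the optimal coupling matches the i-th smallest value
    of xs to the i-th smallest value of ys.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    mean_sq = sum((x - y) ** 2 for x, y in zip(xs_sorted, ys_sorted)) / len(xs)
    return math.sqrt(mean_sq)


# Hypothetical posterior samples of one parameter from two samplers.
samples_a = [0.0, 1.0, 2.0]
samples_b = [2.5, 0.5, 1.5]  # same shape, shifted by 0.5, in arbitrary order

distance = w2_1d(samples_a, samples_b)
```

A pure shift between two otherwise identical sample sets gives a $W_2$ equal to the size of the shift, so small values indicate that the two posteriors agree in both location and shape.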
