Alternators For Sequence Modeling
Mohammad Reza Rezaei, Adji Bousso Dieng
TL;DR
Alternators address the challenge of modeling time-dependent data with complex, non-Markovian dynamics by coupling two neural networks, the observation trajectory network (OTN) and the feature trajectory network (FTN), that alternately generate observations and latent features. The two networks are trained jointly by minimizing a cross-entropy objective over the joint trajectory distributions, yielding informative low-dimensional latent dynamics and high-quality sequence predictions. Across Lorenz attractor dynamics, neural decoding from brain activity, and sea-surface temperature forecasting, alternators often outperform strong baselines in trajectory fidelity and on predictive tasks while being faster to sample from. The framework offers a versatile, interpretable alternative to diffusion/score-based models and high-dimensional latent-variable models, with practical impact for scientific time-series modeling and data imputation.
Abstract
This paper introduces alternators, a novel family of non-Markovian dynamical models for sequences. An alternator features two neural networks: the observation trajectory network (OTN) and the feature trajectory network (FTN). The OTN and the FTN work in conjunction, alternating over a cycle between outputting samples in the observation space and in a feature space, respectively. The parameters of the OTN and the FTN are not time-dependent and are learned via a minimum cross-entropy criterion over the trajectories. Alternators are versatile: they can be used as dynamical latent-variable generative models or as sequence-to-sequence predictors. They can uncover the latent dynamics underlying complex sequential data, accurately forecast and impute missing data, and sample new trajectories. We showcase the capabilities of alternators in three applications. We first use alternators to model the Lorenz equations, which are often used to describe chaotic behavior. We then apply alternators to neuroscience, mapping brain activity to physical activity. Finally, we apply alternators to climate science, focusing on sea-surface temperature forecasting. In all our experiments, we find that alternators are stable to train, fast to sample from, yield high-quality generated samples and latent variables, and often outperform strong baselines such as Mambas, neural ODEs, and diffusion models in the domains we studied.
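The alternation between the two networks can be illustrated with a minimal sketch. This is not the paper's implementation: the toy `otn` and `ftn` functions, the additive Gaussian noise, the noise scales `sigma_x` and `sigma_z`, and the form of the feature update are all assumptions made for illustration. The sketch only shows the sampling pattern described above, in which time-independent parameters are reused at every step while the OTN and the FTN take turns producing an observation and a feature.

```python
import numpy as np


def sample_trajectory(otn, ftn, T, dz, rng, sigma_x=0.1, sigma_z=0.1):
    """Alternate between observation and feature samples for T steps.

    Hypothetical sketch: Gaussian noise and the update rule are assumptions.
    """
    z = rng.standard_normal(dz)  # initial feature drawn from a standard normal
    xs, zs = [], []
    for _ in range(T):
        # OTN step: map the previous feature to an observation-space sample.
        mean_x = otn(z)
        x = mean_x + sigma_x * rng.standard_normal(mean_x.shape)
        # FTN step: map the new observation (and previous feature) to a feature.
        mean_z = ftn(x, z)
        z = mean_z + sigma_z * rng.standard_normal(mean_z.shape)
        xs.append(x)
        zs.append(z)
    return np.stack(xs), np.stack(zs)


rng = np.random.default_rng(0)
dx, dz = 5, 2  # illustrative observation and feature dimensions

# Toy stand-ins for the two networks; the same weights are used at every step.
W_o = rng.standard_normal((dx, dz))
W_f = rng.standard_normal((dz, dx))


def otn(z):
    return np.tanh(W_o @ z)


def ftn(x, z):
    return 0.5 * np.tanh(W_f @ x) + 0.5 * z


xs, zs = sample_trajectory(otn, ftn, T=10, dz=dz, rng=rng)
```

Here `xs` holds one sampled observation trajectory and `zs` the matching low-dimensional feature trajectory. In a real model the two toy functions would be trained neural networks.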
