The Alpha-Alternator: Dynamic Adaptation To Varying Noise Levels In Sequences Using The Vendi Score For Improved Robustness and Performance

Mohammad Reza Rezaei, Adji Bousso Dieng

TL;DR

The paper tackles the limitation of fixed-noise assumptions in sequential models by introducing the $\alpha$-Alternator, which uses the Vendi Score (VS) to adaptively weight the current observation against the latent history at each time step. A learned, dataset-wide scalar controls the direction of influence, so the model can treat an observation that raises the VS as either noise or informative input, and training employs random masking to simulate varying noise levels. Empirically, the approach outperforms Mambas and Alternators in neural decoding and time-series forecasting, with strong robustness to missing data and clear gains when adaptive gating and masking are combined. This work advances robust sequence modeling in noisy real-world data, with potential impact on neuroscience data analysis and diverse temporal forecasting tasks.

Abstract

Current state-of-the-art dynamical models, such as Mamba, assume the same level of noisiness for all elements of a given sequence, which limits their performance on noisy temporal data. In this paper, we introduce the $\alpha$-Alternator, a novel generative model for time-dependent data that dynamically adapts to the complexity introduced by varying noise levels in sequences. The $\alpha$-Alternator leverages the Vendi Score (VS), a flexible similarity-based diversity metric, to adjust, at each time step $t$, the influence of the sequence element at time $t$ and the latent representation of the dynamics up to that time step on the predicted future dynamics. This influence is captured by a parameter that is learned and shared across all sequences in a given dataset. The sign of this parameter determines the direction of influence. A negative value indicates a noisy dataset, where a sequence element that increases the VS is considered noisy, and the model relies more on the latent history when processing that element. Conversely, when the parameter is positive, a sequence element that increases the VS is considered informative, and the $\alpha$-Alternator relies more on this new input than on the latent history when updating its predicted latent dynamics. The $\alpha$-Alternator is trained using a combination of observation masking and Alternator loss minimization. Masking simulates varying noise levels in sequences, enabling the model to be more robust to these fluctuations and improving its performance in trajectory prediction, imputation, and forecasting. Our experimental results demonstrate that the $\alpha$-Alternator outperforms both Alternators and state-of-the-art state-space models across neural decoding and time-series forecasting benchmarks.
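To make the role of the VS more concrete, the sketch below computes the Vendi Score in its standard form, the exponential of the Shannon entropy of the eigenvalues of $K/n$ for an $n \times n$ similarity matrix $K$ with unit diagonal. The RBF kernel, the `gamma` bandwidth, and the `toy_input_weight` gate are assumptions made here for illustration only; the abstract does not give the paper's kernel or its exact update rule, so the gate merely mimics the described sign behavior and should not be read as the authors' formula.

```python
import numpy as np


def vendi_score(X, gamma=1.0):
    """Vendi Score of the rows of X.

    Uses an RBF similarity kernel (an assumption of this sketch), so K[i, i] = 1.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances between rows.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    K = np.exp(-gamma * sq_dists)
    # Eigenvalues of K / n are non-negative and sum to 1.
    eigvals = np.linalg.eigvalsh(K / n)
    eigvals = eigvals[eigvals > 1e-12]  # drop numerical zeros before taking logs
    entropy = -np.sum(eigvals * np.log(eigvals))
    return float(np.exp(entropy))


def toy_input_weight(history, x_t, alpha, gamma=1.0):
    """Hypothetical gate, NOT the paper's formula.

    Returns a weight in (0, 1) on the new element x_t. With alpha < 0, an element
    that raises the VS is down-weighted (treated as noise); with alpha > 0 it is
    up-weighted (treated as informative).
    """
    vs_before = vendi_score(history, gamma)
    vs_after = vendi_score(np.vstack([history, x_t]), gamma)
    delta = vs_after - vs_before
    return float(1.0 / (1.0 + np.exp(-alpha * delta)))


if __name__ == "__main__":
    history = np.zeros((3, 2))        # three identical past elements
    outlier = np.array([10.0, 10.0])  # a new element far from the history
    print("VS of history:", vendi_score(history))
    print("weight, alpha=-2:", toy_input_weight(history, outlier, alpha=-2.0))
    print("weight, alpha=+2:", toy_input_weight(history, outlier, alpha=2.0))
```

Under these assumptions, a set of identical elements has a VS of 1 and a set of $n$ mutually dissimilar elements has a VS of $n$, which is why the score can be read as an effective number of distinct elements in the window.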

Paper Structure

This paper contains 10 sections, 12 equations, 4 figures, 2 tables, 2 algorithms.

Figures (4)

  • Figure 1: The $\alpha$-Alternator is robust to varying noise levels compared to a Mamba and an Alternator. The Alternator is more robust to noise than the Mamba.
  • Figure 2: The $\alpha$-Alternator outperforms other models on trajectory prediction in the neural decoding task on all three datasets in terms of MSE and CC. In terms of MAE, the $\alpha$-Alternator outperforms the baselines on all datasets except the Hippocampus dataset, which has lower temporal diversity as shown in Figure 3.
  • Figure 3: VS over time for the Motor Cortex, Hippocampus, and Somatosensory Cortex datasets. Lower VS values in the Hippocampus indicate less diverse observations across time steps, leading to a diminished effect of the adaptive mechanism in the $\alpha$-Alternator compared to the Mamba. In contrast, for the Motor Cortex and the Somatosensory datasets, the $\alpha$-Alternator effectively leverages VS-based adaptation, outperforming the Mamba in handling varying noise levels.
  • Figure 4: Comparison of performance on neural imputation across different brain regions. The $\alpha$-Alternator consistently outperforms the baselines in imputing missing values across Motor Cortex, Somatosensory, and Hippocampus datasets. Results are averaged across missing value rates ranging from 10% to 95%, with performance measured using MAE, MSE, and CC. Vertical bars indicate standard errors across different missing value rates. The $\alpha$-Alternator achieves notably lower errors and higher CCs across all three neural regions, with particularly strong performance in the complex Hippocampus dataset.
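The evaluation protocol described for Figure 4 (mask observations at several missing-value rates, score the reconstruction with MAE, MSE, and CC, then report the mean and standard error across rates) can be sketched as follows. Everything specific here is an assumption for illustration: masking whole time steps by zeroing them, the particular grid of rates, the synthetic sine/cosine sequence, and the `carry_forward` imputer, which is a naive stand-in and not any model from the paper.

```python
import numpy as np


def mask_observations(x, missing_rate, rng):
    """Zero out a random subset of time steps (rows) of x.

    Returns the masked copy and a boolean array that is True where a row was kept.
    Zeroing masked rows is an assumption of this sketch.
    """
    x = np.asarray(x, dtype=float)
    keep = rng.random(x.shape[0]) >= missing_rate
    return x * keep[:, None], keep


def carry_forward(x_masked, keep):
    """Naive stand-in imputer: repeat the last kept observation."""
    x_hat = x_masked.copy()
    for i in range(1, len(x_hat)):
        if not keep[i]:
            x_hat[i] = x_hat[i - 1]
    return x_hat


def mae(y, y_hat):
    return float(np.mean(np.abs(y - y_hat)))


def mse(y, y_hat):
    return float(np.mean((y - y_hat) ** 2))


def cc(y, y_hat):
    """Pearson correlation coefficient over all flattened entries."""
    return float(np.corrcoef(np.ravel(y), np.ravel(y_hat))[0, 1])


def mean_and_stderr(values):
    """Mean and standard error of the mean (sample standard deviation, ddof=1)."""
    values = np.asarray(values, dtype=float)
    return float(values.mean()), float(values.std(ddof=1) / np.sqrt(len(values)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 4.0 * np.pi, 200)
    x = np.stack([np.sin(t), np.cos(t)], axis=1)  # synthetic (T, D) sequence
    rates = [0.10, 0.25, 0.50, 0.75, 0.95]        # hypothetical grid of rates
    scores = []
    for rate in rates:
        x_masked, keep = mask_observations(x, rate, rng)
        scores.append(mse(x, carry_forward(x_masked, keep)))
    print("MSE mean and standard error across rates:", mean_and_stderr(scores))
```

Swapping `carry_forward` for a trained model's imputation function would give the kind of per-dataset summary that the figure caption describes, with the standard error reflecting variation across missing-value rates rather than across random seeds.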