MultiDiffNet: A Multi-Objective Diffusion Framework for Generalizable Brain Decoding

Mengchun Zhang, Kateryna Shapovalenko, Yucheng Shao, Eddie Guo, Parusha Pradhan

TL;DR

MultiDiffNet tackles the core challenge of generalizing EEG decoding to unseen subjects by learning a compact, shared latent space $z$ through a jointly trained conditional DDPM, a discriminative encoder, and a generative decoder. The framework avoids synthetic augmentation by relying on multi-objective optimization (classification, reconstruction, and contrastive learning) and improves representation quality with temporal mixup strategies. A unified four-task benchmark and a trend-level statistical reporting protocol address reproducibility concerns in low-trial, high-variance EEG research. Empirical results demonstrate improved cross-subject generalization across SSVEP, P300, Motor Imagery, and Imagined Speech, with ablations clarifying the importance of decoding from $z$, lightweight classifiers on $z$, and careful mixup design. The approach offers a reproducible, open-source foundation for subject-agnostic EEG decoding in real-world BCI systems.

Abstract

Neural decoding from electroencephalography (EEG) remains fundamentally limited by poor generalization to unseen subjects, driven by high inter-subject variability and the lack of large-scale datasets to model it effectively. Existing methods often rely on synthetic subject generation or simplistic data augmentation, but these strategies fail to scale or generalize reliably. We introduce MultiDiffNet, a diffusion-based framework that bypasses generative augmentation entirely by learning a compact latent space optimized for multiple objectives. We decode directly from this space and achieve state-of-the-art generalization across various neural decoding tasks using subject- and session-disjoint evaluation. We also curate and release a unified benchmark suite spanning four EEG decoding tasks of increasing complexity (SSVEP, Motor Imagery, P300, and Imagined Speech) and an evaluation protocol that addresses inconsistent split practices in prior EEG research. Finally, we develop a statistical reporting framework tailored for low-trial EEG settings. Our work provides a reproducible and open-source foundation for subject-agnostic EEG decoding in real-world BCI systems.
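
As a rough illustration of what "a latent space optimized for multiple objectives" can look like, the sketch below combines a classification term, a reconstruction term, and a contrastive term into one weighted loss. This is a minimal sketch under stated assumptions, not the paper's objective: the function names, the weights `w_cls`, `w_rec`, `w_con`, the SupCon-style contrastive form, and the temperature are all hypothetical, and the DDPM denoising term is omitted for brevity.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for one trial (log-sum-exp for stability)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum - logits[label]

def mse(x, x_hat):
    """Mean squared error between a signal and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

def cosine(u, v):
    """Cosine similarity between two latent vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def supervised_contrastive(latents, labels, temperature=0.5):
    """SupCon-style term (an assumed form): same-label latents are pulled
    together, all other latents in the batch act as negatives."""
    n = len(latents)
    total, count = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # anchor has no same-class partner in this batch
        sims = {j: cosine(latents[i], latents[j]) / temperature
                for j in range(n) if j != i}
        log_denom = math.log(sum(math.exp(s) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        count += 1
    return total / count if count else 0.0

def multi_objective_loss(logits, labels, signals, recons, latents,
                         w_cls=1.0, w_rec=1.0, w_con=1.0):
    """Weighted sum of the three terms; the weights are hypothetical."""
    cls = sum(cross_entropy(l, y) for l, y in zip(logits, labels)) / len(labels)
    rec = sum(mse(x, xh) for x, xh in zip(signals, recons)) / len(signals)
    con = supervised_contrastive(latents, labels)
    return w_cls * cls + w_rec * rec + w_con * con, (cls, rec, con)
```

Returning the individual terms alongside the total makes it easy to monitor each objective separately, which is useful when one term dominates the others during training.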

Paper Structure

This paper contains 34 sections, 12 equations, 3 figures, 56 tables, 1 algorithm.

Figures (3)

  • Figure 1: Overview of MultiDiffNet, which jointly optimizes a conditional DDPM, a contrastive encoder, and a generative decoder through a shared latent space $z$. The encoder produces discriminative features used for both classification and contrastive learning, while the decoder and DDPM reconstruct the input signal. An optional temporal masked mixup module stochastically blends the original, DDPM-denoised, and decoder-reconstructed EEG to improve representation quality.
  • Figure 2: Overview of four EEG datasets ranked by task difficulty from easiest (top) to hardest (bottom). Task paradigms and preprocessing details are adapted from the original publications: SSVEP [wang2017benchmark], P300 [korczowski2019brain], Motor Imagery [tangermann2012review], and Imagined Speech [zhao2015classifying].
  • Figure 3: (A) Visualization of latent space across training epochs. (B) Downstream classification performance from frozen latent representations.
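
The temporal masked mixup described in the Figure 1 caption can be sketched as follows. This is only one plausible reading of "stochastically blends", not the authors' implementation: the single contiguous time window, the `mask_frac` parameter, and the random convex weights are assumptions introduced for illustration. Because all three inputs are versions of the same trial, the class label is left unchanged.

```python
import random

def temporal_masked_mixup(original, denoised, reconstructed,
                          mask_frac=0.5, rng=None):
    """Blend three versions of one EEG trial (channels x time, as nested
    lists) inside a single random time window; keep the original elsewhere.

    Assumed design: one contiguous window covering mask_frac of the trial,
    with one set of random convex weights shared by all channels.
    """
    rng = rng or random.Random()
    n_time = len(original[0])
    width = max(1, int(round(mask_frac * n_time)))
    start = rng.randrange(0, n_time - width + 1)

    # Random convex weights over the three sources: normalized exponential
    # draws are uniformly distributed on the simplex.
    raw = [rng.expovariate(1.0) for _ in range(3)]
    total = sum(raw)
    w_orig, w_den, w_rec = (r / total for r in raw)

    mixed = []
    for ch_o, ch_d, ch_r in zip(original, denoised, reconstructed):
        row = list(ch_o)  # copy so the input trial is not modified
        for t in range(start, start + width):
            row[t] = w_orig * ch_o[t] + w_den * ch_d[t] + w_rec * ch_r[t]
        mixed.append(row)
    return mixed, (start, width)
```

Passing a seeded `random.Random` instance as `rng` makes the augmentation reproducible, which matters when results are compared across runs in low-trial settings.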