WaveFormer: Wavelet Embedding Transformer for Biomedical Signals

Habib Irani, Bikram De, Vangelis Metsis

TL;DR

Biomedical time-series classification is challenged by long sequences, non-stationary dynamics, and multi-scale frequency content. WaveFormer integrates wavelets into a patch-based transformer at two stages: a wavelet-enhanced embedding built from two paths (Path A and Path B), and Dynamic Wavelet Positional Encoding (DyWPE). Both feed a Transformer encoder with relative positional bias. The model achieves competitive or state-of-the-art results across eight diverse datasets, with seven wins and notable gains on long sequences; ablations attribute most of the improvement to explicit frequency decomposition. The approach offers a principled framework for embedding frequency-domain knowledge into transformer-based time-series classifiers for long, multi-channel biomedical signals.

Abstract

Biomedical signal classification presents unique challenges due to long sequences, complex temporal dynamics, and multi-scale frequency patterns that are poorly captured by standard transformer architectures. We propose WaveFormer, a transformer architecture that integrates wavelet decomposition at two critical stages: embedding construction, where multi-channel Discrete Wavelet Transform (DWT) extracts frequency features to create tokens containing both time-domain and frequency-domain information, and positional encoding, where Dynamic Wavelet Positional Encoding (DyWPE) adapts position embeddings to signal-specific temporal structure through mono-channel DWT analysis. We evaluate WaveFormer on eight diverse datasets spanning human activity recognition and brain signal analysis, with sequence lengths ranging from 50 to 3000 timesteps and channel counts from 1 to 144. Experimental results demonstrate that WaveFormer achieves competitive performance through comprehensive frequency-aware processing. Our approach provides a principled framework for incorporating frequency-domain knowledge into transformer-based time series classification.
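
To make the embedding idea concrete, the sketch below decomposes one single-channel patch with a multi-level DWT and appends the coefficients to the raw samples, so the resulting token carries both time-domain and frequency-domain information. It is an illustration under stated assumptions, not the authors' implementation: the Haar wavelet, the two decomposition levels, the plain concatenation, and the helper names (`haar_dwt_step`, `haar_wavedec`, `patch_token`) are all hypothetical choices made here for brevity.

```python
import math


def haar_dwt_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Scaled pairwise sums capture the slow trend, scaled differences the fast changes.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_wavedec(signal, levels):
    """Multi-level decomposition, ordered coarse to fine: [cA_L, cD_L, ..., cD_1]."""
    coeffs = []
    approx = list(signal)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        coeffs.insert(0, detail)
    coeffs.insert(0, approx)
    return coeffs


def patch_token(patch, levels=2):
    """Hypothetical token: raw patch samples followed by its wavelet coefficients."""
    freq = [c for band in haar_wavedec(patch, levels) for c in band]
    return list(patch) + freq


# Toy single-channel patch of 8 samples (assumed values, not real data).
patch = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coeffs = haar_wavedec(patch, levels=2)
token = patch_token(patch, levels=2)
```

Here `coeffs[0]` holds the level-2 approximation coefficients (the slow trend) and the remaining entries hold detail coefficients at progressively finer scales. In a full model the concatenated vector would typically pass through a learned linear projection before entering the encoder; that step is omitted here.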

Paper Structure

This paper contains 17 sections, 13 equations, 4 figures, and 2 tables.

Figures (4)

  • Figure 1: Example of multi-scale wavelet decomposition of an EEG signal (SelfRegulationSCP1). Different frequency bands capture distinct physiological patterns: approximation coefficients encode slow trends while detail coefficients isolate specific neural rhythms.
  • Figure 2: Overall architecture of the WaveFormer model.
  • Figure 3: Three approaches to compute self-attention. (a) Clipping RPE: relative distances beyond a threshold are clipped. (b) T5-based bucketing RPE (adopted in WaveFormer): uses logarithmic bucketing for parameter-efficient distance modeling. (c) Wavelet-based attention downsampling.
  • Figure 4: WaveFormer performance analysis.
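
The logarithmic bucketing named in Figure 3(b) can be sketched as follows. Small relative distances each get their own bucket, larger distances share logarithmically spaced buckets, and everything beyond a maximum distance falls into the last bucket, so a small table of learned biases covers arbitrarily long sequences. This is a minimal sketch in the style of T5, not WaveFormer's code: the values `num_buckets=32` and `max_distance=128` are assumed for illustration, and the function names are hypothetical.

```python
import math


def relative_position_bucket(rel_pos, num_buckets=32, max_distance=128):
    """Map a signed relative position (key index minus query index) to a bucket id."""
    half = num_buckets // 2              # half of the buckets per direction
    bucket = half if rel_pos > 0 else 0  # keys after the query use the upper half
    n = abs(rel_pos)
    max_exact = half // 2                # distances below this get one bucket each
    if n < max_exact:
        return bucket + n
    # Larger distances are spaced logarithmically up to max_distance.
    log_ratio = math.log(n / max_exact) / math.log(max_distance / max_exact)
    large = max_exact + int(log_ratio * (half - max_exact))
    # Anything at or beyond max_distance is clamped into the last bucket.
    return bucket + min(large, half - 1)


def bucket_matrix(seq_len, num_buckets=32, max_distance=128):
    """Bucket id for every (query i, key j) pair in a sequence of length seq_len."""
    return [
        [relative_position_bucket(j - i, num_buckets, max_distance)
         for j in range(seq_len)]
        for i in range(seq_len)
    ]


buckets = bucket_matrix(4)
```

In an attention layer, each bucket id would index a learned bias (one value per head and bucket) that is added to the attention logits before the softmax. The parameter count therefore depends on `num_buckets` rather than on the sequence length.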