STaRFormer: Semi-Supervised Task-Informed Representation Learning via Dynamic Attention-Based Regional Masking for Sequential Data

Maximilian Forstenhäusler; Daniel Külzer; Christos Anagnostopoulos; Shameem Puthiya Parambath; Natascha Weber

TL;DR

STaRFormer targets non-stationary and irregularly sampled sequential data by combining dynamic attention-based regional masking with a task-informed, semi-supervised contrastive learning scheme. A Siamese encoder-Transformer architecture learns latent representations that are aligned both batch-wise and class-wise while the downstream task (classification, anomaly detection, or regression) is optimized jointly. Experiments on 56 datasets spanning these tasks show improvements over state-of-the-art approaches, particularly under irregular sampling and non-stationarity, along with robust and well-separated latent representations. The main limitation is training-time overhead from the masking and dual-view contrastive learning; inference remains efficient, and the framework applies broadly across time-series domains.

Abstract

Understanding user intent is essential for situational and context-aware decision-making. Motivated by a real-world scenario, this work addresses intent prediction for smart device users in the vicinity of vehicles by modeling sequential spatiotemporal data. In such scenarios, however, environmental factors and sensor limitations can result in non-stationary and irregularly sampled data, posing significant challenges. To address these issues, we propose STaRFormer, a Transformer-based approach that can serve as a universal framework for sequential modeling. STaRFormer utilizes a new dynamic attention-based regional masking scheme combined with a novel semi-supervised contrastive learning paradigm to enhance task-specific latent representations. Comprehensive experiments on 56 datasets varying in type (including non-stationary and irregularly sampled), task, domain, sequence length, number of training samples, and application demonstrate the efficacy of STaRFormer, achieving notable improvements over state-of-the-art approaches.
