Normalizing self-supervised learning for provably reliable Change Point Detection
Alexandra Bazarova, Evgenia Romanenkova, Alexey Zaytsev
TL;DR
This work tackles unsupervised change point detection (CPD) by combining self-supervised time-series representation learning with spectral normalization (SN). The authors prove that SN preserves test power for both kernel-based and likelihood-ratio CPD tests, relying on the bi-Lipschitz and kernel-distance preservation properties of the normalized embeddings. They instantiate the framework with two SSL backbones, TS2Vec and TS-BYOL, and evaluate it on the Yahoo! A4 Benchmark, USC-HAD, and HASC datasets, achieving competitive or leading F1 scores. The result links representation learning with classical CPD theory, giving a theoretically grounded and practically effective approach to CPD in high-dimensional time series.
Abstract
Change point detection (CPD) methods aim to identify abrupt shifts in the distribution of input data streams. Accurate estimators for this task are crucial across various real-world scenarios. Yet traditional unsupervised CPD techniques face significant limitations: they often rely on strong assumptions or suffer from low expressive power due to inherent model simplicity. In contrast, representation learning methods overcome these drawbacks by offering flexibility and the ability to capture the full complexity of the data without imposing restrictive assumptions. However, these approaches are still emerging in the CPD field and lack robust theoretical foundations to ensure their reliability. Our work addresses this gap by integrating the expressive power of representation learning with the groundedness of traditional CPD techniques. We adopt spectral normalization (SN) for deep representation learning in CPD tasks and prove that the embeddings after SN are highly informative for CPD. Our method significantly outperforms current state-of-the-art methods in a comprehensive evaluation on three standard CPD datasets.
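
To make the role of spectral normalization concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single linear map in place of a deep encoder, estimates its largest singular value by power iteration, and rescales the weights so the map is approximately 1-Lipschitz. The names `spectral_norm` and `spectrally_normalize`, the matrix shape, and the iteration count are illustrative assumptions. Only the upper Lipschitz bound is demonstrated; the lower bound of the bi-Lipschitz property is not shown here.

```python
import numpy as np

def spectral_norm(W, n_iter=200, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(W.shape[1])
    v /= np.linalg.norm(v)
    for _ in range(n_iter):
        u = W @ v
        u /= np.linalg.norm(u)   # approximate left singular vector
        v = W.T @ u
        v /= np.linalg.norm(v)   # approximate right singular vector
    return float(u @ W @ v)

def spectrally_normalize(W, n_iter=200):
    """Rescale W so that its largest singular value is approximately 1."""
    return W / spectral_norm(W, n_iter)

# Hypothetical 16x8 weight matrix standing in for one encoder layer.
rng = np.random.default_rng(1)
W = rng.standard_normal((16, 8))
W_sn = spectrally_normalize(W)

# Exact top singular value of the normalized matrix, for comparison.
top_sv = np.linalg.svd(W_sn, compute_uv=False)[0]

# Upper Lipschitz bound: distances between outputs do not exceed
# distances between inputs (up to numerical tolerance).
x, y = rng.standard_normal(8), rng.standard_normal(8)
lipschitz_ok = np.linalg.norm(W_sn @ x - W_sn @ y) <= np.linalg.norm(x - y) + 1e-9
```

In a deep network this rescaling would be applied to each layer during training. Deep learning frameworks typically run only one or a few power-iteration steps per update, rather than the many steps used here for clarity.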
