Learning to Embed Time Series Patches Independently
Seunghan Lee, Taeyoung Park, Kibok Lee
TL;DR
This paper investigates patch-based self-supervised learning (SSL) for time series and argues that learning dependencies between patches is not necessary for strong representations. It proposes Patch Independence for Time Series (PITS), which uses a patch reconstruction pretraining task and a patch-wise MLP encoder to embed patches independently, plus complementary contrastive learning (CL) to capture adjacent temporal information. The approach achieves state-of-the-art performance on forecasting and classification across diverse datasets while using fewer parameters and less training and inference time than Transformer-based methods. Key findings include the robustness of patch-independent (PI) learning to distribution shifts and patch-size variations, the interpretability of PI representations, and the practical benefits of combining patch reconstruction with hierarchical CL. Overall, PITS provides a strong, efficient baseline for SSL in time series analysis, with potential applicability to other domains.
Abstract
Masked time series modeling has recently gained much attention as a self-supervised representation learning strategy for time series. Inspired by masked image modeling in computer vision, recent works first patchify and partially mask out time series, and then train Transformers to capture the dependencies between patches by predicting masked patches from unmasked patches. However, we argue that capturing such patch dependencies might not be an optimal strategy for time series representation learning; rather, learning to embed patches independently results in better time series representations. Specifically, we propose to use 1) the simple patch reconstruction task, which autoencodes each patch without looking at other patches, and 2) the simple patch-wise MLP that embeds each patch independently. In addition, we introduce complementary contrastive learning to hierarchically capture adjacent time series information efficiently. Our proposed method improves time series forecasting and classification performance compared to state-of-the-art Transformer-based models, while it is more efficient in terms of the number of parameters and training/inference time. Code is available at this repository: https://github.com/seunghan96/pits.
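
To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a patch-wise MLP trained on a patch reconstruction objective. It is not the authors' implementation (see the repository above for that). The names `patchify`, `PatchAutoencoder`, and `reconstruction_loss`, the single hidden layer with ReLU, the non-overlapping patches, and the toy series are all assumptions made for illustration. Plain Python is used only to keep the example self-contained; a practical version would use a tensor library and would also need an optimizer, which is omitted here.

```python
import random


def patchify(series, patch_len):
    """Split a 1-D series into non-overlapping patches, dropping any remainder."""
    n_patches = len(series) // patch_len
    return [series[i * patch_len:(i + 1) * patch_len] for i in range(n_patches)]


def linear(x, weight, bias):
    """Dense layer; weight is a list of out_dim rows, each of length in_dim."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def init_layer(rng, in_dim, out_dim):
    """Random uniform weights and zero biases (an illustrative choice)."""
    scale = 1.0 / in_dim ** 0.5
    weight = [[rng.uniform(-scale, scale) for _ in range(in_dim)]
              for _ in range(out_dim)]
    return weight, [0.0] * out_dim


class PatchAutoencoder:
    """Hypothetical patch-wise MLP: every patch is processed on its own."""

    def __init__(self, patch_len, d_model, seed=0):
        rng = random.Random(seed)
        self.enc = init_layer(rng, patch_len, d_model)
        self.dec = init_layer(rng, d_model, patch_len)

    def embed(self, patch):
        # The embedding depends only on this patch, never on its neighbours.
        hidden = linear(patch, *self.enc)
        return [max(0.0, h) for h in hidden]

    def reconstruct(self, patch):
        # Autoencode the patch from its own embedding.
        return linear(self.embed(patch), *self.dec)


def reconstruction_loss(model, patches):
    """Mean squared error between each patch and its own reconstruction."""
    total, count = 0.0, 0
    for patch in patches:
        recon = model.reconstruct(patch)
        total += sum((r - p) ** 2 for r, p in zip(recon, patch))
        count += len(patch)
    return total / count


series = [float(t % 7) for t in range(48)]  # toy periodic series
patches = patchify(series, patch_len=12)
model = PatchAutoencoder(patch_len=12, d_model=8)
embeddings = [model.embed(p) for p in patches]
loss = reconstruction_loss(model, patches)

# Perturb only the last patch and re-embed every patch.
perturbed = [list(p) for p in patches]
perturbed[-1] = [v + 100.0 for v in perturbed[-1]]
perturbed_embeddings = [model.embed(p) for p in perturbed]
```

In this sketch, perturbing the last patch leaves the embeddings of all other patches unchanged, which is the sense in which the encoder is patch-independent. A Transformer encoder that attends across patches would not have this property.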
