LogoRA: Local-Global Representation Alignment for Robust Time Series Classification

Huanyu Zhang; Yi-Fan Zhang; Zhang Zhang; Qingsong Wen; Liang Wang

TL;DR

This work addresses unsupervised domain adaptation (UDA) for time-series classification by learning both global and local representations and aligning them across domains. It introduces LogoRA, a two-branch encoder (a global patch-based Transformer and a multi-scale local CNN) with a Local-Global Fusion Module and a suite of alignment losses: DTW-based invariant learning, triplet-margin, domain-adversarial, and per-class prototype center losses. Empirical results on HHAR, WISDM, HAR, and Sleep-EDF show LogoRA consistently outperforming strong baselines, with notable gains such as +12.52% on HHAR and +10.21% on WISDM, and an overall average improvement of around +6.40%. The approach demonstrates that jointly modeling global context and multi-scale local patterns enhances domain-invariant feature learning for time-series UDA and offers promising avenues for future multimodal extensions.

Abstract

Unsupervised domain adaptation (UDA) of time series aims to teach models to identify consistent patterns across various temporal scenarios while disregarding domain-specific differences, so that they maintain predictive accuracy and adapt effectively to new domains. However, existing UDA methods struggle to adequately extract and align both global and local features in time series data. To address this issue, we propose the Local-Global Representation Alignment framework (LogoRA), which employs a two-branch encoder comprising a multi-scale convolutional branch and a patching transformer branch. The encoder extracts both local and global representations from time series. A fusion module then integrates these representations, enhancing domain-invariant feature alignment from multi-scale perspectives. To achieve effective alignment, LogoRA applies invariant feature learning on the source domain, using a triplet loss for fine alignment and dynamic time warping-based feature alignment. Additionally, it reduces source-target domain gaps through adversarial training and per-class prototype alignment. Our evaluations on four time-series datasets demonstrate that LogoRA outperforms strong baselines by up to $12.52\%$, showcasing its superiority in time series UDA tasks.
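The abstract names dynamic time warping (DTW) and a triplet loss as ingredients of the source-domain invariant feature learning. The following is a minimal sketch of those two building blocks in their textbook form, not the paper's implementation: the function names, the absolute-difference step cost, the default margin, and the choice to plug DTW into the triplet loss as its distance are all assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW distance between two 1-D sequences (assumed |x - y| step cost)."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: insertion, deletion, or match
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def triplet_margin_loss(anchor, positive, negative, margin=1.0, dist=dtw_distance):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin."""
    return max(0.0, dist(anchor, positive) - dist(anchor, negative) + margin)


# Hypothetical toy sequences: the positive is a time-stretched copy of the
# anchor, the negative is the anchor reversed.
anchor = [1.0, 2.0, 3.0]
positive = [1.0, 2.0, 2.0, 3.0]
negative = [3.0, 2.0, 1.0]
loss = triplet_margin_loss(anchor, positive, negative, margin=1.0)
```

Because DTW tolerates local stretching, the time-stretched positive sits at distance zero from the anchor, while the reversed negative does not. The loss vanishes once the negative is farther than the positive by at least the margin.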

Paper Structure (19 sections, 9 equations, 10 figures, 6 tables, 1 algorithm)

This paper contains 19 sections, 9 equations, 10 figures, 6 tables, 1 algorithm.

Introduction
Related Work
Methodology
Problem Definition and Overall Architecture
Feature Extractor
Local-Global Fusion Module
Invariant Feature Learning On Source Domain
Alignment Across Domain Representations
Training
Experiments
Experimental Setup
Numerical Results on UDA Benchmarks
Ablation Studies
Visualization
Inference Time and Parameters
...and 4 more sections

Figures (10)

Figure 1: Model performance comparison on various tasks. RI denotes the relative improvement compared to SOTA.
Figure 2: A motivating example containing accelerometer data segments of walking upstairs (upper) and walking downstairs (lower) from the HAR dataset.
Figure 3: Model Architecture and Training Pipeline of LogoRA. The time series data is processed through a feature extractor, comprising a Global Encoder and a Multi-Scale Local Encoder, to extract local and global representations. Next, these representations are fed into the Fusion Module to obtain the fused representations. We further use different representations for invariant feature learning ($\mathcal{L}_{dtw}$ and $\mathcal{L}_{margin}$) and alignment across the source and target domain ($\mathcal{L}_{domain}$ and $\mathcal{L}_{center}$).
Figure 4: (a) Global Encoder: We use a Transformer encoder with a patching operation to obtain fine-grained global representations. (b) Multi-Scale Local Encoder: We use a ConvNet with different kernel sizes to acquire multi-scale local representations.
Figure 5: Local-Global Fusion Module. We use cross-attention to fuse global and multi-scale local representations. Next, we concatenate all the cross-attentions and sum them to get the final output.
...and 5 more figures
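The captions for Figures 4 and 5 describe three operations: patching the series for the global branch, convolving with several kernel sizes for the local branch, and fusing the two with cross-attention. The sketch below illustrates those operations on a toy series under several loud assumptions: there are no learned weights or Transformer layers, raw patches stand in for global tokens, moving-average kernels of sizes 3 and 5 stand in for the ConvNet, the global tokens act as queries, and the per-scale outputs are simply summed. All names and sizes are hypothetical.

```python
from math import exp, sqrt


def make_patches(series, patch_len, stride):
    """Split a 1-D series into (possibly overlapping) fixed-length patches."""
    return [series[s:s + patch_len]
            for s in range(0, len(series) - patch_len + 1, stride)]


def conv1d_valid(series, kernel):
    """1-D sliding dot product without padding (cross-correlation, as in most DL libraries)."""
    k = len(kernel)
    return [sum(series[i + j] * kernel[j] for j in range(k))
            for i in range(len(series) - k + 1)]


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys."""
    d = len(keys[0])
    out = []
    for q in queries:
        weights = softmax([sum(qi * ki for qi, ki in zip(q, k)) / sqrt(d) for k in keys])
        out.append([sum(w * v[c] for w, v in zip(weights, values))
                    for c in range(len(values[0]))])
    return out


def fuse(global_tokens, local_scales):
    """Attend from global tokens to each local scale and sum the results (assumed fusion rule)."""
    fused = [[0.0] * len(global_tokens[0]) for _ in global_tokens]
    for local_tokens in local_scales:
        attended = cross_attention(global_tokens, local_tokens, local_tokens)
        for row, add in zip(fused, attended):
            for c, v in enumerate(add):
                row[c] += v
    return fused


series = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
global_tokens = make_patches(series, patch_len=4, stride=2)  # 3 tokens of width 4
local_scales = []
for k in (3, 5):  # hypothetical kernel sizes
    smoothed = conv1d_valid(series, [1.0 / k] * k)  # moving average of width k
    local_scales.append(make_patches(smoothed, patch_len=4, stride=2))
fused = fuse(global_tokens, local_scales)  # one fused vector per global token
```

In this toy setup every global token receives one attended vector per local scale, so the fused output keeps the shape of the global tokens (3 tokens of width 4) regardless of how many local tokens each scale produces.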
