LightSAE: Parameter-Efficient and Heterogeneity-Aware Embedding for IoT Multivariate Time Series Forecasting

Yi Ren, Xinjie Yu

TL;DR

The paper tackles channel heterogeneity in IoT Multivariate Time Series Forecasting by reframing the embedding layer as a channel-specific transformation. It introduces Shared-Auxiliary Embedding (SAE), which decomposes embeddings into a shared base and channel-specific auxiliaries, and discovers that auxiliary weights exhibit low-rank and clustering structures. To exploit these observations, LightSAE combines low-rank factorization with a shared pool and gating to achieve parameter-efficient, heterogeneity-aware embeddings. Across 9 IoT datasets and 4 backbone architectures, LightSAE delivers consistent improvements in MSE (up to 22.8% in one ablation) with only about a 4% parameter increase, validating its practical effectiveness and plug-and-play applicability for existing MTSF models.

Abstract

Modern Internet of Things (IoT) systems generate massive, heterogeneous multivariate time series data. Accurate Multivariate Time Series Forecasting (MTSF) of such data is critical for numerous applications. However, existing methods almost universally employ a shared embedding layer that processes all channels identically, creating a representational bottleneck that obscures valuable channel-specific information. To address this challenge, we introduce a Shared-Auxiliary Embedding (SAE) framework that decomposes the embedding into a shared base component capturing common patterns and channel-specific auxiliary components modeling unique deviations. Within this decomposition, we empirically observe that the auxiliary components tend to exhibit low-rank and clustering characteristics, a structural pattern that is significantly less apparent when using purely independent embeddings. Consequently, we design LightSAE, a parameter-efficient embedding module that operationalizes these observed characteristics through low-rank factorization and a shared, gated component pool. Extensive experiments across 9 IoT-related datasets and 4 backbone architectures demonstrate LightSAE's effectiveness, achieving MSE improvements of up to 22.8% with only 4.0% parameter increase.
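
The decomposition described in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation: the abstract only states that SAE adds a channel-specific auxiliary to a shared base, and that LightSAE uses low-rank factorization with a shared, gated component pool. The softmax gate, the way pool components are mixed, and all names and sizes (`C`, `L`, `D`, `K`, `r`, `sae_embed`, `light_embed`) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes only; none of these values come from the paper.
C, L, D = 6, 96, 64   # channels, look-back length, embedding dimension
K, r = 3, 4           # hypothetical pool size and rank

x = rng.standard_normal((C, L))            # one look-back window per channel
W_s = 0.02 * rng.standard_normal((L, D))   # shared base, used by all channels


def sae_embed(x, W_s, W_aux):
    """SAE: channel i is embedded with W_s + W_aux[i]."""
    shared = x @ W_s                                   # (C, D)
    auxiliary = np.einsum("cl,cld->cd", x, W_aux)      # (C, D)
    return shared + auxiliary


def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def light_embed(x, W_s, A, B, gate_logits):
    """Assumed LightSAE-style variant: gated mix of K low-rank components."""
    g = softmax(gate_logits)                           # (C, K) channel gates
    z = np.einsum("cl,klr,krd->ckd", x, A, B)          # (C, K, D)
    return x @ W_s + np.einsum("ck,ckd->cd", g, z)


# Full auxiliaries: one dense (L, D) matrix per channel.
W_aux = 0.02 * rng.standard_normal((C, L, D))

# Pooled low-rank auxiliaries: K factor pairs shared by all channels.
A = 0.02 * rng.standard_normal((K, L, r))
B = 0.02 * rng.standard_normal((K, r, D))
gate_logits = rng.standard_normal((C, K))

e_sae = sae_embed(x, W_s, W_aux)
e_light = light_embed(x, W_s, A, B, gate_logits)

aux_params_sae = W_aux.size
aux_params_light = A.size + B.size + gate_logits.size
print(e_sae.shape, e_light.shape)
print(aux_params_sae, aux_params_light)
```

With these toy sizes the dense auxiliaries hold 36,864 parameters, while the pooled factors and gates hold 1,938. These counts only illustrate why pooling low-rank factors is cheaper than one dense matrix per channel; they are unrelated to the 4.0% figure reported in the paper.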

Paper Structure

This paper contains 35 sections, 13 equations, 11 figures, 5 tables.

Figures (11)

  • Figure 1: Illustration of our motivation. (a) Channel heterogeneity in the Electricity dataset [electricityloaddiagrams20112014_321]. Four representative channels demonstrate distinct temporal patterns and statistical distributions: irregular fluctuations (top), trend with periodic spikes (second), regular oscillations (third), and square-wave patterns (bottom). (b) Existing MTSF methods uniformly adopt a shared embedding layer.
  • Figure 2: A comparison of different embedding strategies for MTSF. (a) The general pipeline for deep MTSF models. (b) Standard Shared Embedding, where all channels use a single embedding layer. (c) Independent Channel Embedding, where each channel has its own separate embedding layer. (d) Our proposed Shared-Auxiliary Embedding (SAE), which combines a shared base with channel-specific auxiliary components. (e) Our final LightSAE module, which enhances SAE with parameter-efficient low-rank components and a component pool.
  • Figure 3: Cumulative energy ratio comparison for different embedding mechanisms. "Shared Weight" represents the shared component in SAE, "Auxiliary Avg" represents the averaged value of auxiliary components $\bm{W}_{c_i}$ from SAE, and "Ind Avg" represents the averaged value from Independent Channel Embedding weights, $\bm{W}_i$.
  • Figure 4: Comparison of cosine similarity patterns between channel weights. (a) Similarity among Independent Channel Embedding weights $\bm{W}_i$ on Electricity dataset. (b) Similarity among SAE auxiliary weights $\bm{W}_{c_i}$ on Electricity dataset. (c) Similarity among Independent Channel Embedding weights $\bm{W}_i$ on PEMS04 dataset. (d) Similarity among SAE auxiliary weights $\bm{W}_{c_i}$ on PEMS04 dataset.
  • Figure 5: Performance improvement vs. number of channels across four backbone models.
  • ...and 6 more figures
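
The two diagnostics named in the captions of Figures 3 and 4 can be sketched as follows. This is a hedged illustration on synthetic weights, not the paper's analysis code: the captions do not define "energy", so the sketch assumes it means squared singular values, and it assumes cosine similarity is taken over flattened per-channel weight matrices. The function names and the synthetic two-cluster data are hypothetical.

```python
import numpy as np


def cumulative_energy_ratio(W):
    """Fraction of spectral energy captured by the top-k singular values."""
    s = np.linalg.svd(W, compute_uv=False)   # singular values, descending
    energy = s ** 2                          # assumption: energy = sigma^2
    return np.cumsum(energy) / energy.sum()


def pairwise_cosine(weights):
    """Cosine similarity between flattened per-channel weight matrices."""
    flat = weights.reshape(weights.shape[0], -1)
    unit = flat / np.linalg.norm(flat, axis=1, keepdims=True)
    return unit @ unit.T


rng = np.random.default_rng(0)
L, D = 96, 64   # illustrative look-back length and embedding dimension


def low_rank(rank):
    """Random (L, D) matrix of the given rank, used as a cluster center."""
    return rng.standard_normal((L, rank)) @ rng.standard_normal((rank, D))


# Synthetic stand-in for auxiliary weights: six channels in two clusters,
# each channel a rank-2 center plus small noise.
centers = [low_rank(2), low_rank(2)]
weights = np.stack([
    centers[i // 3] + 0.01 * rng.standard_normal((L, D)) for i in range(6)
])

# Average the per-channel curves, analogous to an "Avg" curve in Figure 3.
avg_curve = np.mean([cumulative_energy_ratio(W) for W in weights], axis=0)

# Channel-by-channel similarity matrix, analogous to a panel in Figure 4.
sim = pairwise_cosine(weights)
```

On this synthetic data the averaged curve saturates after two singular values and the similarity matrix shows two blocks, which is the kind of low-rank and clustering signature the captions refer to.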