SiamTST: A Novel Representation Learning Framework for Enhanced Multivariate Time Series Forecasting applied to Telco Networks

Simen Kristoffersen, Peter Skaar Nordby, Sara Malacarne, Massimiliano Ruocco, Pablo Ortiz

TL;DR

SiamTST addresses multivariate time series (MTS) forecasting in telecom networks by learning robust representations with a Siamese Time Series Transformer. The encoder uses channel-independent patching, RMSNorm, and QKNorm in self-attention; it is pre-trained first, and a simple linear head is then used for forecasting. The approach is evaluated on a large real-world Telenor Denmark dataset against state-of-the-art representation learning methods (TS2Vec, CoST, SimTS, PatchTST) and strong baselines (LinearNet, Ridge). SiamTST achieves better MAE and MSE than the compared methods at forecast horizons of $24$, $48$, $96$, and $168$ hours, with larger gains at longer horizons. Pre-training across multiple sectors further improves performance, with diminishing returns after about 50 sectors. The work provides public PyTorch code and a scalable, transferable framework for industrial MTS forecasting. The results suggest utility for network traffic management and optimization and motivate applying SiamTST to other complex MTS settings.

Abstract

We introduce SiamTST, a novel representation learning framework for multivariate time series. SiamTST integrates a Siamese network with attention, channel-independent patching, and normalization techniques to achieve superior performance. Evaluated on a real-world industrial telecommunication dataset, SiamTST demonstrates significant improvements in forecasting accuracy over existing methods. Notably, a simple linear network also shows competitive performance, achieving the second-best results, just behind SiamTST. The code is available at https://github.com/simenkristoff/SiamTST.
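
To make the idea of channel-independent patching concrete, the following is a minimal sketch in plain Python rather than the authors' PyTorch implementation. It assumes the PatchTST-style reading of the term: each channel is cut into fixed-length patches on its own, so that the same encoder can process every channel as a univariate sequence of patches. The function name `patch_channels` and the values of `patch_len` and `stride` are illustrative assumptions, not settings taken from the paper.

```python
def patch_channels(series, patch_len, stride):
    """Split every channel of a multivariate series into patches.

    series: list of channels, each a list of floats of equal length.
    Returns one list of patches per channel; channels are never mixed.
    """
    if patch_len <= 0 or stride <= 0:
        raise ValueError("patch_len and stride must be positive")
    patched = []
    for channel in series:
        # Start indices of all complete patches in this channel.
        starts = range(0, len(channel) - patch_len + 1, stride)
        patched.append([channel[s:s + patch_len] for s in starts])
    return patched


# Hypothetical toy input: two channels with 8 hourly steps each.
series = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0],
]
patches = patch_channels(series, patch_len=4, stride=4)
```

With `stride` equal to `patch_len` the patches do not overlap; a smaller stride yields overlapping patches and therefore more tokens per channel.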

Paper Structure (13 sections, 5 equations, 3 figures, 3 tables)

Figures (3)

  • Figure 1: SiamTST's Transformer encoder module compared to the original Transformer encoder module from Vaswani et al. (2017). Our version moves the normalization layer ahead of the residual connections, replaces LayerNorm with RMSNorm, and applies QKNorm at the multi-head attention layer.
  • Figure 2: 168-hour forecasts of the feature $msdr\_denom$ for a given sector, produced by SiamTST, PatchTST, and LinearNet. The orange line shows the true future values.
  • Figure 3: 168-hour forecasts of the feature $mcdr\_denom$ for a given sector, produced by the baseline model and by models pre-trained on $5$, $10$, $50$, and $98$ sectors. The orange line shows the true future values.
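
The three encoder changes named in the Figure 1 caption can be illustrated with a small plain-Python sketch. This is a simplified illustration under stated assumptions, not the authors' code: RMSNorm is written in its usual form (divide by the root mean square, then apply a learned gain), pre-normalization is written as `x + sublayer(norm(x))`, and QKNorm is shown in one common formulation where queries and keys are L2-normalized and the logits are multiplied by a scale. The paper may use a different QKNorm variant; all function names and the `scale` parameter are hypothetical.

```python
import math


def rms_norm(x, gain, eps=1e-8):
    """RMSNorm: rescale x by its root mean square, then apply a per-feature gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [g * v / rms for v, g in zip(x, gain)]


def pre_norm_residual(x, sublayer, gain):
    """Pre-normalization: normalize first, then add the sublayer output to x."""
    normed = rms_norm(x, gain)
    return [a + b for a, b in zip(x, sublayer(normed))]


def l2_normalize(x, eps=1e-8):
    """Scale a vector to (approximately) unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in x)) + eps
    return [v / norm for v in x]


def qk_norm_attention_weights(queries, keys, scale):
    """Attention weights from L2-normalized queries and keys (assumed variant)."""
    q_hat = [l2_normalize(q) for q in queries]
    k_hat = [l2_normalize(k) for k in keys]
    weights = []
    for q in q_hat:
        logits = [scale * sum(a * b for a, b in zip(q, k)) for k in k_hat]
        m = max(logits)  # subtract the max for a numerically stable softmax
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Hypothetical toy inputs.
normed = rms_norm([3.0, 4.0], gain=[1.0, 1.0])
weights = qk_norm_attention_weights(
    queries=[[1.0, 0.0], [0.0, 2.0]],
    keys=[[1.0, 0.0], [0.0, 1.0]],
    scale=5.0,
)
```

Because queries and keys are normalized before the dot product, rescaling a query does not change its attention weights in this sketch, which is the kind of logit-magnitude control that QKNorm is meant to provide.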