Time-Series Foundation Models for ISP Traffic Forecasting

Fan Liu; Behrooz Farkiani; Patrick Crowley

TL;DR

The paper addresses scalable ISP traffic forecasting by evaluating a pretrained time-series foundation model, IBM's Tiny Time Mixer (TTM), on the CESNET-TimeSeries24 dataset. It compares zero-shot and few-shot settings across multiple temporal resolutions and hierarchical aggregation levels. TTM achieves consistent accuracy, with RMSE in the 0.026–0.057 range and stable $R^2$ across horizons, and CPU-only inference takes under 0.05 s per 100 points. The results show robust cross-level generalization, modest gains from limited fine-tuning, and negligible benefit from exogenous features. Overall, the work supports a train-once, deploy-everywhere paradigm for scalable network monitoring, in which forecasting requires neither per-deployment training nor specialized hardware.

Abstract

Accurate network-traffic forecasting enables proactive capacity planning and anomaly detection in Internet Service Provider (ISP) networks. Recent advances in time-series foundation models (TSFMs) have demonstrated strong zero-shot and few-shot generalization across diverse domains, yet their effectiveness for computer networking remains unexplored. This paper presents a systematic evaluation of a TSFM, IBM's Tiny Time Mixer (TTM), on the CESNET-TimeSeries24 dataset, a 40-week real-world ISP telemetry corpus. We assess TTM under zero-shot and few-shot settings across multiple forecasting horizons (hours to days), aggregation hierarchies (institutions, subnets, IPs), and temporal resolutions (10-minute and hourly). Results show that TTM achieves consistent accuracy (RMSE 0.026–0.057) and stable $R^2$ scores across horizons and context lengths, outperforming or matching fully trained deep learning baselines such as GRU and LSTM. Inference latency remains under 0.05 s per 100 points on a single MacBook Pro using CPU-only computation, confirming deployability without dedicated GPU or MPS acceleration. These findings highlight the potential of pretrained TSFMs to enable scalable, efficient, and training-free forecasting for modern network monitoring and management systems.
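The two accuracy metrics quoted above, RMSE and $R^2$, can be illustrated with a minimal sketch. This uses the standard textbook definitions and is not the authors' evaluation code. The function names and the sample values are hypothetical, and the sketch assumes the series have already been scaled to a comparable range.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for a constant series")
    return 1.0 - ss_res / ss_tot


# Hypothetical scaled traffic values and forecasts (not taken from the paper).
actual = [0.20, 0.25, 0.30, 0.35, 0.40]
predicted = [0.22, 0.24, 0.33, 0.33, 0.41]

print(f"RMSE = {rmse(actual, predicted):.4f}")
print(f"R^2  = {r2_score(actual, predicted):.4f}")
```

RMSE is expressed in the units of the series, so a range such as 0.026–0.057 is only meaningful relative to the scale of the data, whereas $R^2$ is scale-free and measures improvement over predicting the mean.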

Figures (7)