Time-Series Foundation Models for ISP Traffic Forecasting
Fan Liu, Behrooz Farkiani, Patrick Crowley
TL;DR
The paper addresses scalable ISP traffic forecasting by evaluating a pretrained time-series foundation model, IBM's Tiny Time Mixer (TTM), on the CESNET-TimeSeries24 dataset. It compares zero-shot and few-shot settings across multiple temporal resolutions and hierarchical aggregation levels. TTM achieves consistent accuracy, with RMSE between 0.026 and 0.057 and stable $R^2$ across horizons, and CPU-only inference takes under 0.05 s per 100 points. The results show robust cross-level generalization, modest gains from limited fine-tuning, and negligible benefit from exogenous features. Overall, the work supports a train-once, deploy-everywhere paradigm for network monitoring, with no task-specific training or specialized hardware required.
Abstract
Accurate network-traffic forecasting enables proactive capacity planning and anomaly detection in Internet Service Provider (ISP) networks. Recent advances in time-series foundation models (TSFMs) have demonstrated strong zero-shot and few-shot generalization across diverse domains, yet their effectiveness for computer networking remains unexplored. This paper presents a systematic evaluation of a TSFM, IBM's Tiny Time Mixer (TTM), on the CESNET-TimeSeries24 dataset, a 40-week real-world ISP telemetry corpus. We assess TTM under zero-shot and few-shot settings across multiple forecasting horizons (hours to days), aggregation hierarchies (institutions, subnets, IPs), and temporal resolutions (10-minute and hourly). Results show that TTM achieves consistent accuracy (RMSE 0.026 to 0.057) and stable $R^2$ scores across horizons and context lengths, matching or outperforming fully trained deep learning baselines such as GRU and LSTM. Inference latency remains under 0.05 s per 100 points on a single MacBook Pro using CPU-only computation, confirming deployability without dedicated GPU or MPS acceleration. These findings highlight the potential of pretrained TSFMs to enable scalable, efficient, and training-free forecasting for modern network monitoring and management systems.
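
For readers less familiar with the two accuracy metrics quoted above, the following minimal sketch shows how RMSE and $R^2$ are conventionally computed for a forecast. The function names and the toy series are illustrative assumptions; this is not the authors' evaluation code, and the values are made up rather than drawn from CESNET-TimeSeries24.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined for a constant target series")
    return 1.0 - ss_res / ss_tot


# Hypothetical values, assumed for illustration only.
actual = [0.10, 0.20, 0.30, 0.40]
forecast = [0.12, 0.18, 0.33, 0.37]

print("RMSE:", rmse(actual, forecast))
print("R^2:", r2_score(actual, forecast))
```

RMSE is expressed in the same units as the series, so a range such as the one reported is only meaningful relative to the scale of the data, whereas $R^2$ is scale-free and measures how much of the variance around the mean the forecast explains.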
