Are Time-Series Foundation Models Deployment-Ready? A Systematic Study of Adversarial Robustness Across Domains
Jiawen Zhang, Zhenwei Zhang, Shun Zheng, Xumeng Wen, Jia Li, Jiang Bian
TL;DR
The paper systematically evaluates the adversarial robustness of Time-Series Foundation Models (TSFMs) using a time-series–grounded framework that normalizes perturbation budgets and unifies evaluation across white-box and black-box settings. It reveals that current TSFMs are highly brittle, with vulnerabilities such as horizon-proximal brittleness and context-length amplification, and that attack transfer across models is limited. Both untargeted and targeted attacks are effective even at small budgets, and targeted attacks can steer forecasts toward attacker-defined trajectories, highlighting safety risks in deployment. The authors demonstrate that lightweight defenses such as latent-space or input-space adversarial training substantially improve worst-case robustness and can transfer across domains, offering a viable path toward deployment-ready TSFMs. Overall, the work argues that robustness should be treated as a prerequisite alongside accuracy for safe TSFM deployment in real-world decision making.
Abstract
Time-Series Foundation Models (TSFMs) are rapidly transitioning from research prototypes to core components of critical decision-making systems, driven by their impressive zero-shot forecasting capabilities. However, as their deployment surges, a critical blind spot remains: their fragility under adversarial attacks. This lack of scrutiny poses severe risks, particularly as TSFMs enter high-stakes environments vulnerable to manipulation. We present a systematic, diagnostic study arguing that for TSFMs, robustness is not a secondary metric but a prerequisite for trustworthy deployment, on par with accuracy. Our evaluation framework, explicitly tailored to the unique constraints of time series, incorporates normalized, sparsity-aware perturbation budgets and unified scale-invariant metrics across white-box and black-box settings. Across six representative TSFMs, we demonstrate that current architectures are alarmingly brittle: even small perturbations can reliably steer forecasts toward specific failure modes, such as trend flips and malicious drifts. We uncover TSFM-specific vulnerability patterns, including horizon-proximal brittleness, increased susceptibility with longer context windows, and weak cross-model transfer that points to model-specific failure modes rather than generic distortions. Finally, we show that simple adversarial fine-tuning offers a cost-effective path to substantial robustness gains, even with out-of-domain data. This work bridges the gap between TSFM capabilities and safety constraints, offering essential guidance for hardening the next generation of forecasting systems.
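To make the phrase "normalized, sparsity-aware perturbation budgets and unified scale-invariant metrics" concrete, the following is a minimal sketch under stated assumptions, not the paper's actual framework. It assumes the budget is a fraction `rel_eps` of the context's mean absolute value, that sparsity means perturbing at most `k` time steps, and that the metric is the forecast shift divided by the same scale. The toy linear forecaster and every function name here are hypothetical stand-ins for a TSFM and its attack pipeline.

```python
from statistics import mean


def series_scale(context):
    """Mean absolute value of the context; falls back to 1.0 for an all-zero series."""
    return mean(abs(x) for x in context) or 1.0


def linear_forecast(context, weights):
    """Toy one-step forecaster (assumed stand-in for a TSFM): weighted sum of the context."""
    return sum(w * x for w, x in zip(weights, context))


def sparse_sign_attack(context, weights, rel_eps, k):
    """White-box sketch: shift the k most influential steps by +/- eps to raise the forecast.

    eps is normalized by the series scale, so the same rel_eps means the same
    relative perturbation on any series. For a linear model the gradient of
    the forecast with respect to input i is simply weights[i].
    """
    eps = rel_eps * series_scale(context)
    top = sorted(range(len(context)), key=lambda i: abs(weights[i]), reverse=True)[:k]
    adv = list(context)
    for i in top:
        sign = (weights[i] > 0) - (weights[i] < 0)  # -1, 0, or +1
        adv[i] += eps * sign
    return adv


def scaled_shift(context, adv_context, weights):
    """Scale-invariant metric (assumed form): forecast shift divided by the series scale."""
    clean = linear_forecast(context, weights)
    attacked = linear_forecast(adv_context, weights)
    return abs(attacked - clean) / series_scale(context)


if __name__ == "__main__":
    context = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
    weights = [0.0, 0.0, 0.1, 0.2, 0.3, 0.4]  # toy weights favoring recent steps
    adv = sparse_sign_attack(context, weights, rel_eps=0.05, k=2)
    print(scaled_shift(context, adv, weights))
```

Because both the budget and the metric are divided by the same scale, rescaling the series (for example, changing units) leaves the reported shift unchanged, which is what allows results to be compared across datasets with very different magnitudes.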
