Adapt Data to Model: Adaptive Transformation Optimization for Domain-shared Time Series Foundation Models

Yunzhong Qiu; Zhiyao Cen; Zhongyi Pei; Chen Wang; Jianmin Wang

TL;DR

Time-series adaptive transformation optimization (TATO) is a data-centric framework that lets a single frozen pre-trained large time series model (LTM) adapt to diverse downstream domains through an optimally configured transformation pipeline, making it practical for real-world deployment.

Abstract

Large time series models (LTMs) have emerged as powerful tools for universal forecasting, yet they often struggle with the inherent diversity and nonstationarity of real-world time series data, leading to an unsatisfactory trade-off between forecasting accuracy and generalization. Rather than continually finetuning new LTM instances for each domain, we propose a data-centric framework, time-series adaptive transformation optimization (TATO), that enables a single frozen pre-trained LTM to adapt to diverse downstream domains through an optimally configured transformation pipeline. Specifically, TATO constructs three representative types of transformations, including context slicing, scale normalization, and outlier correction, to help LTMs better align with target domain characteristics. To ensure robustness, we incorporate carefully selected time series augmentations and a two-stage ranking mechanism that filters out pipelines underperforming on specific metrics. Extensive experiments on state-of-the-art LTMs and widely used datasets demonstrate that TATO consistently and significantly improves domain-adaptive forecasting performance, achieving a maximum reduction in MSE of 65.4% and an average reduction of 13.6%. Moreover, TATO is highly efficient, typically completing optimization in under 2 minutes, making it practical for real-world deployment. The source code is available at https://github.com/thulab/TATO.
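The three transformation types named in the abstract can be illustrated with a minimal sketch. It is not taken from the TATO repository: the function names, the MAD-based clipping rule, the z-score normalization, and the `last_value_model` stand-in for a frozen LTM are all assumptions chosen for illustration. The point is only that the model stays untouched while the input is transformed and the output is mapped back to the original scale.

```python
from statistics import mean, median, pstdev
from typing import Callable, List, Sequence

# Hypothetical interface for a frozen forecaster: (context, horizon) -> predictions.
Forecaster = Callable[[List[float], int], List[float]]


def slice_context(series: Sequence[float], context_len: int) -> List[float]:
    """Context slicing: keep only the most recent `context_len` points."""
    return list(series[-context_len:])


def correct_outliers(series: Sequence[float], k: float = 3.0) -> List[float]:
    """Outlier correction (assumed rule): clip points beyond k scaled MADs of the median."""
    med = median(series)
    mad = median(abs(x - med) for x in series)
    if mad == 0:
        return list(series)  # no spread estimate available, leave the data unchanged
    bound = k * 1.4826 * mad
    return [min(max(x, med - bound), med + bound) for x in series]


def forecast_with_pipeline(
    series: Sequence[float],
    horizon: int,
    model: Forecaster,
    context_len: int = 8,
    k: float = 3.0,
) -> List[float]:
    """Apply slicing, outlier correction, and z-score scaling, then invert the scaling."""
    ctx = correct_outliers(slice_context(series, context_len), k)
    mu = mean(ctx)
    sigma = pstdev(ctx) or 1.0  # guard against a constant context
    scaled = [(x - mu) / sigma for x in ctx]
    pred = model(scaled, horizon)  # the model itself is never modified
    return [p * sigma + mu for p in pred]


def last_value_model(context: List[float], horizon: int) -> List[float]:
    """Stand-in for a frozen LTM: repeats the last observed value."""
    return [context[-1]] * horizon
```

In this sketch, `context_len` and `k` are the kind of pipeline settings an optimizer would search over. The search space actually used by TATO is defined in the paper and its source code.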

Paper Structure (35 sections, 10 equations, 5 figures, 8 tables)

Introduction
Related Work
Large Time Series Models
Time Series Transformation
Test-Time Adaptation
Method
The Paradigm of Adapting Data to Model
The TATO Framework
Data Preparation
Transformation Pipeline Optimization
Two-stage Pareto-based Ranking of Performance
Experiments
Datasets and Baselines
Evaluation method
Effectiveness
...and 20 more sections

Figures (5)

Figure 1: Three examples illustrating how data transformations enhance LTM predictions. (a) Downsampling stabilizes noisy Moirai predictions on ETTm2. (b) Outlier detection and interpolation correct Timer's misinterpretation of anomalies on Weather. (c) Differencing enables Chronos to capture trends on Exchange by inducing stationarity. Together, these examples demonstrate the potential of transformation optimization for FrozenForecasting.
Figure 2: Overview of the TATO framework. The framework consists of three main stages: (1) Data preparation, where diverse augmentations are applied to input samples to improve robustness; (2) Optimization of time series transformations, where a black-box optimizer searches for effective transformation pipelines comprising various preprocessing operators (e.g., trimming, normalization, denoising); and (3) Two-stage pipeline selection, where candidate pipelines are first filtered via Pareto ranking on validation metrics, followed by weighted multi-indicator ranking to select the optimal transformation pipeline for frozen LTM forecasting.
Figure 3: Distribution of MAE before and after applying TATO on three representative tasks. Across all datasets, TATO consistently shifts the error distribution toward lower values, indicating improved forecasting accuracy compared to the vanilla baseline.
Figure 4: Scalability analysis of TATO. (a) MSE improvement with increasing transformation trials (fixed 100 samples). (b) MSE improvement with increasing sample size (fixed 100 trials). Both the number of trials and the sample size range from 50 to 500, and performance consistently improves as either increases.
Figure 5: Ablation study results. (a) Effect of removing key framework components on the reduction of MSE. (b) Effect of removing individual transformation operators on MSE reduction. Mean and median %Promotion are shown for each variant.
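The two-stage selection described in the Figure 2 caption (Pareto filtering followed by weighted multi-indicator ranking) can be sketched as follows. This is an illustrative reading of the caption, not the authors' implementation: the rank-sum scoring, the weights, the pipeline names, and the example scores are all assumptions, and every metric is treated as lower-is-better.

```python
from typing import Dict, List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if `a` is no worse than `b` on every metric and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(scores: Dict[str, Sequence[float]]) -> List[str]:
    """Stage 1: keep only pipelines that no other pipeline dominates."""
    return [
        name
        for name, s in scores.items()
        if not any(dominates(t, s) for other, t in scores.items() if other != name)
    ]


def weighted_rank(
    scores: Dict[str, Sequence[float]],
    candidates: List[str],
    weights: Sequence[float],
) -> List[str]:
    """Stage 2 (assumed scheme): order candidates by a weighted sum of per-metric ranks."""
    totals = {c: 0.0 for c in candidates}
    for m, w in enumerate(weights):
        ordered = sorted(candidates, key=lambda c: scores[c][m])
        for rank, c in enumerate(ordered):
            totals[c] += w * rank
    return sorted(candidates, key=lambda c: totals[c])


# Hypothetical validation scores as (MSE, MAE) per candidate pipeline.
scores = {"A": (0.30, 0.40), "B": (0.25, 0.45), "C": (0.35, 0.50), "D": (0.20, 0.60)}
front = pareto_front(scores)                      # "C" is dominated by "A"
ranking = weighted_rank(scores, front, (0.7, 0.3))
```

Filtering first keeps a pipeline that is poor on every metric from winning the second stage through a favorable weighting. The weights then express which metric matters most among the remaining trade-offs.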

Theorems & Definitions (3)

Definition 1: Frozen Foundation Model-based Domain-shared Forecasting
Definition 2: Time-series Adaptive Transformation Optimization for FrozenForecasting
Definition 3: Relative Percentage Promotion in Error
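The statement of Definition 3 is not reproduced on this page. One common reading of a relative percentage improvement in error, assumed here and consistent with the abstract reporting results as percentage reductions in MSE, is the error reduction divided by the baseline error. The function name and signature below are hypothetical.

```python
def percent_promotion(baseline_error: float, adapted_error: float) -> float:
    """Assumed form: 100 * (baseline - adapted) / baseline; positive means lower error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - adapted_error) / baseline_error
```

Under this reading, halving the error gives a promotion of 50, and a negative value means the transformation pipeline made the forecast worse.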
