Test-Time Efficient Pretrained Model Portfolios for Time Series Forecasting
Mert Kayaalp, Caner Turkmen, Oleksandr Shchur, Pedro Mercado, Abdul Fatir Ansari, Michael Bohlke-Schneider, Bernie Wang
TL;DR
This paper tackles the high computational cost of large pretrained time-series forecasters by proposing Chroma, a portfolio of small pretrained forecasters formed by post-training a generalist into frequency- and domain-specialists. At test time, predictions are produced via model selection or greedy ensembling, achieving competitive accuracy with far fewer active parameters than monolithic models and a favorable compute-efficiency trade-off versus test-time fine-tuning. The approach demonstrates that specialist portfolios, aided by post-training, can scale similarly to generalist models, and that activation patterns across specialists provide a degree of interpretability. Overall, Chroma offers a modular, scalable framework for test-time efficient forecasting that could extend to other domains and serve as a practical alternative to best-of-$N$ sampling from a single base model.
Abstract
Is bigger always better for time series foundation models? With this question in mind, we explore an alternative to training a single, large monolithic model: building a portfolio of smaller, pretrained forecasting models. By applying ensembling or model selection over these portfolios, we achieve competitive performance on large-scale benchmarks using far fewer parameters. We explore strategies for designing such portfolios and find that collections of specialist models consistently outperform portfolios of independently trained generalists. Remarkably, we demonstrate that post-training a base model is a compute-efficient approach for creating sufficiently diverse specialists, and provide evidence that ensembling and model selection are more compute-efficient than test-time fine-tuning.
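
To make the test-time step concrete, the following is a minimal sketch of greedy ensembling over a portfolio. It is not the paper's implementation: it assumes a standard greedy forward selection with replacement, scored by mean absolute error on held-out validation forecasts, and the names `greedy_ensemble` and `select_best` are hypothetical. Model selection is treated as the special case of a single greedy step.

```python
import numpy as np


def greedy_ensemble(val_preds, y_val, n_steps=5):
    """Greedy forward selection with replacement (illustrative assumption).

    val_preds: shape (n_models, n_points), validation forecasts per model.
    y_val:     shape (n_points,), validation targets.
    Returns ensemble weights of shape (n_models,) that sum to 1.
    """
    val_preds = np.asarray(val_preds, dtype=float)
    y_val = np.asarray(y_val, dtype=float)
    counts = np.zeros(val_preds.shape[0], dtype=int)
    running_sum = np.zeros_like(y_val)

    for step in range(1, n_steps + 1):
        # Mean forecast of the ensemble if each candidate were added next.
        candidate_means = (running_sum[None, :] + val_preds) / step
        # Assumed metric: mean absolute error on the validation window.
        errors = np.abs(candidate_means - y_val[None, :]).mean(axis=1)
        best = int(np.argmin(errors))
        counts[best] += 1
        running_sum += val_preds[best]

    return counts / counts.sum()


def select_best(val_preds, y_val):
    """Model selection as a one-step greedy ensemble."""
    return int(np.argmax(greedy_ensemble(val_preds, y_val, n_steps=1)))


# Toy portfolio: two models with opposite biases and one clearly worse model.
y = np.array([1.0, 2.0, 3.0, 4.0])
preds = np.stack([y + 1.0, y - 1.0, y + 3.0])
weights = greedy_ensemble(preds, y, n_steps=2)
best_single = select_best(preds, y)
```

In this toy case the two oppositely biased models receive equal weight and the third is never picked, which illustrates why a diverse portfolio can beat any single member while only the selected models need to run at test time.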
