The Relevance of AWS Chronos: An Evaluation of Standard Methods for Time Series Forecasting with Limited Tuning
Matthew Baron, Alex Karpinski
TL;DR
This study evaluates AWS Chronos against ARIMA and Prophet for time-series forecasting under limited tuning, using a bike-share demand dataset partitioned by user type. Chronos demonstrates strong performance on longer horizons and shows robustness to increasing context length, while traditional methods degrade with more historical data. The results reveal systematic differences across user types and forecast horizons, with naive baselines performing well at very short horizons. The findings support deploying Chronos in real-world, low-tuning settings for longer-range forecasts, and point to future work on incorporating exogenous covariates and multivariate forecasting within the Chronos framework.
Abstract
We present a systematic comparison of Chronos, a transformer-based time series forecasting framework, against traditional approaches including ARIMA and Prophet. We evaluate these models across multiple forecast horizons and user categories, with a focus on the impact of historical context length. Our analysis reveals that Chronos demonstrates superior performance for longer-term predictions and maintains accuracy with increased context, whereas the traditional models degrade significantly as context length increases. We find that prediction quality varies systematically between user classes, suggesting that underlying behavior patterns influence model performance. This study provides a case for deploying Chronos in real-world applications where only limited model tuning is feasible, especially in scenarios requiring longer prediction horizons.
