Commencing-Student Enrolment Forecasting Under Data Sparsity with Time Series Foundation Models
Jittarin Jetwiriyanon, Teo Susnjak, Surangika Ranathunga
TL;DR
This paper tackles the challenge of forecasting commencing university enrolments under data sparsity and regime shifts. It benchmarks three zero-shot Time Series Foundation Models (TSFMs), namely Moirai, Chronos, and TimesFM, against classical baselines, conditioning forecasts on leakage-safe covariates such as the Institutional Operating Conditions Index (IOCI) and Google Trends within an expanding-window backtest. The authors develop a portable covariate protocol and the auditable IOCI, which encodes time-stamped narrative evidence into decision-time covariates, and they assess both point accuracy and probabilistic calibration using MAE, RMSE, and CRPS. Key findings show that, in data-sparse annual settings, covariate-conditioned TSFMs can match or exceed traditional baselines, with performance varying by cohort and model, and that covariate effects are heterogeneous and model-dependent. The study provides a transferable framework for universities to deploy zero-shot TSFMs with principled covariates, highlighting the need for careful model-covariate alignment and robust calibration in practice.
Abstract
Many universities face increasing financial pressure and rely on accurate forecasts of commencing enrolments. However, enrolment forecasting in higher education is often data-sparse: annual series are short and affected by reporting changes and regime shifts. Popular classical approaches can be unreliable, as parameter estimation and model selection are unstable with short samples, and structural breaks degrade extrapolation. Recently, Time Series Foundation Models (TSFMs) have provided zero-shot priors that can deliver strong gains in annual, data-sparse institutional forecasting when covariates are constructed under strict leakage discipline. We benchmark multiple TSFM families in a zero-shot setting and test a compact, leakage-safe covariate set. We also introduce the Institutional Operating Conditions Index (IOCI), a transferable 0-100 regime covariate derived from time-stamped documentary evidence available at each forecast origin, alongside Google Trends demand proxies with stabilising feature engineering. Using an expanding-window backtest with strict vintage alignment, covariate-conditioned TSFMs perform on par with classical benchmarks without institution-specific training, with performance differences varying by cohort and model.
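
To make the evaluation design concrete, the following is a minimal sketch of a one-step-ahead expanding-window backtest scored with MAE, RMSE, and a sample-based CRPS estimate. It is an illustration under stated assumptions, not the authors' pipeline: the naive last-value forecaster stands in for any model, the enrolment figures are made up, and the function names (`expanding_window_backtest`, `crps_from_samples`) and the `min_train` parameter are hypothetical.

```python
import math


def expanding_window_backtest(series, min_train=4):
    """One-step-ahead backtest: each forecast sees only data before its origin."""
    records = []
    for origin in range(min_train, len(series)):
        history = series[:origin]   # the training window grows by one year per origin
        forecast = history[-1]      # naive last-value baseline (placeholder model)
        records.append((series[origin], forecast))
    return records


def mae(records):
    """Mean absolute error over (actual, forecast) pairs."""
    return sum(abs(a - f) for a, f in records) / len(records)


def rmse(records):
    """Root mean squared error over (actual, forecast) pairs."""
    return math.sqrt(sum((a - f) ** 2 for a, f in records) / len(records))


def crps_from_samples(samples, actual):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|."""
    n = len(samples)
    term_obs = sum(abs(x - actual) for x in samples) / n
    term_spread = sum(abs(x - z) for x in samples for z in samples) / (n * n)
    return term_obs - 0.5 * term_spread


# Made-up annual enrolment counts, for illustration only.
enrolments = [100, 110, 105, 120, 130, 125, 140]
records = expanding_window_backtest(enrolments, min_train=4)
print("MAE:", mae(records))
print("RMSE:", rmse(records))
print("CRPS (toy samples):", crps_from_samples([1, 2, 3], 2))
```

In a leakage-safe setting, any covariates passed to the forecaster at a given origin would be restricted in the same way as `history`, that is, to values that were available at that point in time.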
