Curriculum Learning and Imitation Learning for Model-free Control on Financial Time-series
Woosung Koh, Insu Choi, Yuntae Jang, Gimin Kang, Woo Chang Kim
TL;DR
The paper tackles the challenge of model-free control on financial time-series under a fixed data-generating process by assessing Curriculum Learning (via data smoothing) and Imitation Learning (via policy distillation from an oracle) in a portfolio-control MDP. It demonstrates that EMA-based curriculum learning yields consistent out-of-sample improvements across two representative datasets, while imitation learning tends to underperform in highly stochastic markets. The work provides theoretical and empirical analyses of signal-noise decomposition, showing that smoothing can reduce detrimental noise but may also suppress useful signal, and it discusses when imitation learning may be harmful. Overall, the study positions curriculum learning as a promising, practical direction for time-series control in finance, and it highlights the need for caution and further research on imitation learning in such noisy domains.
Abstract
Curriculum learning and imitation learning have been leveraged extensively in the robotics domain. However, little research has been done on applying these ideas to control tasks over highly stochastic time-series data. Here, we theoretically and empirically explore these approaches in a representative control task over complex time-series data. We implement the fundamental ideas of curriculum learning via data augmentation, while imitation learning is implemented via policy distillation from an oracle. Our findings reveal that curriculum learning should be considered a novel direction in improving control-task performance over complex time-series. Our out-of-sample results across many random seeds, together with our ablation studies, are highly encouraging for curriculum learning in time-series control. These findings are especially encouraging because we tune all overlapping hyperparameters on the baseline, which gives the baseline an advantage. On the other hand, we find that imitation learning should be used with caution.
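
To make the smoothing-based curriculum concrete, the following is a minimal sketch of how progressively less smoothed copies of a price series could be produced with an exponential moving average (EMA). It is an illustration under stated assumptions, not the authors' implementation: the function names `ema_smooth` and `ema_curriculum`, the smoothing factors in `alphas`, and the synthetic random-walk prices are all hypothetical choices made for this example.

```python
import numpy as np


def ema_smooth(x, alpha):
    """EMA with s[0] = x[0] and s[t] = alpha * x[t] + (1 - alpha) * s[t-1]."""
    x = np.asarray(x, dtype=float)
    s = np.empty_like(x)
    s[0] = x[0]
    for t in range(1, len(x)):
        s[t] = alpha * x[t] + (1.0 - alpha) * s[t - 1]
    return s


def ema_curriculum(x, alphas=(0.1, 0.3, 0.6, 1.0)):
    """Return smoothed copies ordered from smoothest to the raw series.

    The alpha values are assumed for illustration; alpha = 1.0 reproduces
    the unsmoothed series, so the last stage is the original data.
    """
    return [ema_smooth(x, a) for a in sorted(alphas)]


# Synthetic random-walk "prices" (an assumption, not data from the paper).
rng = np.random.default_rng(0)
prices = 100.0 + np.cumsum(rng.normal(0.0, 1.0, size=500))

stages = ema_curriculum(prices)

# Standard deviation of one-step changes as a rough measure of noise per stage.
roughness = [float(np.std(np.diff(s))) for s in stages]
```

In a curriculum of this kind, an agent would be trained on the earlier, smoother stages before the later, noisier ones. A small `alpha` removes more noise but also more of whatever signal lives in short-horizon movements, which is the trade-off the summary describes.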
