Auxiliary task discovery through generate-and-test
Banafsheh Rafiee, Sina Ghiassian, Jun Jin, Richard Sutton, Jun Luo, Adam White
TL;DR
This paper addresses the challenge of autonomously discovering auxiliary tasks that improve data efficiency in reinforcement learning. It introduces a generate-and-test framework in which a generator proposes new auxiliary tasks and a tester evaluates them by measuring how much the features they induce contribute to the main task, using a Master-User learning strategy to attribute feature changes to specific tasks. The usefulness of an auxiliary task is thus defined through feature-level contributions, and a replacement mechanism prunes ineffective tasks. Experiments on gridworlds and a pinball domain show that the method outperforms both learning without auxiliary tasks and learning with fixed random tasks, while a feature-attainment variant offers improved scalability. The work provides a practical, tunable pathway toward automatic auxiliary task discovery and lays groundwork for future integration with meta-learning and application to larger-scale domains.
Abstract
In this paper, we explore an approach to auxiliary task discovery in reinforcement learning based on ideas from representation learning. Auxiliary tasks tend to improve data efficiency by forcing the agent to learn auxiliary prediction and control objectives in addition to the main task of maximizing reward, thus producing better representations. Typically these tasks are designed by people. Meta-learning offers a promising avenue for automatic task discovery; however, these methods are computationally expensive and challenging to tune in practice. We explore a complementary approach to auxiliary task discovery: continually generating new auxiliary tasks and preserving only those with high utility. We also introduce a new measure of an auxiliary task's usefulness based on how useful the features it induces are for the main task. Our discovery algorithm significantly outperforms random tasks and learning without auxiliary tasks across a suite of environments.
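
The generate-and-test idea described above (keep generating auxiliary tasks, retain only those with high utility) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `generate_and_test`, `generate`, `utility`, `pool_size`, `rounds`, and `num_replaced` are hypothetical, and the utility function is a stand-in supplied by the caller rather than the feature-based measure the paper proposes.

```python
import random
from typing import Callable, List, TypeVar

Task = TypeVar("Task")


def generate_and_test(
    generate: Callable[[random.Random], Task],
    utility: Callable[[Task], float],
    pool_size: int = 5,
    rounds: int = 20,
    num_replaced: int = 1,
    seed: int = 0,
) -> List[Task]:
    """Toy generate-and-test loop over a fixed-size pool of auxiliary tasks.

    Assumption: `utility` is an arbitrary caller-supplied score. In the paper,
    usefulness is derived from how much a task's induced features contribute
    to the main task; that computation is not reproduced here.
    """
    if not 0 <= num_replaced < pool_size:
        raise ValueError("num_replaced must be in [0, pool_size)")

    rng = random.Random(seed)
    # Generator: propose an initial pool of tasks.
    pool = [generate(rng) for _ in range(pool_size)]

    for _ in range(rounds):
        # Tester: order the current tasks by utility, lowest first.
        pool.sort(key=utility)
        # Replacement: overwrite the lowest-utility tasks with new proposals.
        for i in range(num_replaced):
            pool[i] = generate(rng)

    return pool


# Toy usage: a "task" is just a random number and its utility is its value.
final_pool = generate_and_test(lambda rng: rng.random(), lambda task: task)
print(sorted(final_pool))
```

Because fewer tasks are replaced each round than the pool holds, the highest-utility task seen so far always survives in this sketch, while low-utility slots are recycled for new candidates.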
