Which Tasks Should Be Learned Together in Multi-task Learning?
Trevor Standley, Amir R. Zamir, Dawn Chen, Leonidas Guibas, Jitendra Malik, Silvio Savarese
TL;DR
This work asks which tasks should be learned together in multi-task learning under a fixed inference-time budget. It introduces a task grouping framework that considers candidate networks for the non-empty task subsets and selects a small set of them, each solving a subset of tasks, to maximize overall accuracy within the budget. The authors show that task relationships are highly setup-dependent and that naive joint training can underperform carefully grouped task networks. They offer two approximations that reduce training cost, the Early Stopping Approximation (ESA) and the Higher Order Approximation (HOA), which make finding near-optimal groupings practical. Across multiple settings, their approach outperforms single-task and full joint baselines, highlighting the importance of automatic task grouping for real-time multi-task vision systems.
Abstract
Many computer vision applications require solving multiple tasks in real-time. A neural network can be trained to solve multiple tasks simultaneously using multi-task learning. This can save computation at inference time as only a single network needs to be evaluated. Unfortunately, this often leads to inferior overall performance as task objectives can compete, which consequently poses the question: which tasks should and should not be learned together in one network when employing multi-task learning? We study task cooperation and competition in several different learning settings and propose a framework for assigning tasks to a few neural networks such that cooperating tasks are computed by the same neural network, while competing tasks are computed by different networks. Our framework offers a time-accuracy trade-off and can produce better accuracy using less inference time than not only a single large multi-task neural network but also many single-task networks.
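To make the grouping problem concrete, the following is a minimal sketch of the selection step, not the authors' implementation. It assumes that each candidate network (one per task subset) already has a known inference cost and a per-task loss, and it searches exhaustively for the set of networks that covers every task within the budget at the lowest total loss. The function name `best_grouping`, the `candidates` layout, and all numbers are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def best_grouping(tasks, candidates, budget):
    """Exhaustively pick networks that cover all tasks within the budget.

    candidates maps frozenset(task names) -> (cost, {task: loss}).
    Each task is read from whichever chosen network gives it the lowest loss.
    Returns (total_loss, tuple_of_chosen_networks), or (inf, None) if
    no combination fits the budget.
    """
    best_loss, best_choice = float("inf"), None
    nets = list(candidates)
    for k in range(1, len(nets) + 1):
        for chosen in combinations(nets, k):
            cost = sum(candidates[n][0] for n in chosen)
            if cost > budget:
                continue  # over the inference-time budget
            if set().union(*chosen) != set(tasks):
                continue  # some task would have no network
            total = sum(
                min(candidates[n][1][t] for n in chosen if t in n)
                for t in tasks
            )
            if total < best_loss:
                best_loss, best_choice = total, chosen
    return best_loss, best_choice


# Made-up losses: A and B cooperate, while C competes with both.
tasks = ["A", "B", "C"]
candidates = {
    frozenset("A"): (1, {"A": 0.50}),
    frozenset("B"): (1, {"B": 0.40}),
    frozenset("C"): (1, {"C": 0.60}),
    frozenset("AB"): (1, {"A": 0.45, "B": 0.35}),
    frozenset("AC"): (1, {"A": 0.60, "C": 0.70}),
    frozenset("BC"): (1, {"B": 0.50, "C": 0.65}),
    frozenset("ABC"): (1, {"A": 0.55, "B": 0.45, "C": 0.70}),
}

loss_two, nets_two = best_grouping(tasks, candidates, budget=2)
loss_one, nets_one = best_grouping(tasks, candidates, budget=1)
print(sorted(sorted(n) for n in nets_two), round(loss_two, 2))
print(sorted(sorted(n) for n in nets_one), round(loss_one, 2))
```

With these illustrative numbers, a budget of two networks favors training A and B together and C alone, while a budget of one forces the single all-task network at a higher total loss. Exhaustive search grows exponentially with the number of candidate networks, so this sketch is only practical for a handful of tasks.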
