Beyond Shared Hierarchies: Deep Multitask Learning through Soft Layer Ordering
Elliot Meyerson, Risto Miikkulainen
TL;DR
The paper challenges the standard parallel ordering assumption in deep multitask learning and demonstrates that allowing flexible, task-specific layer usage through permuted and soft ordering can significantly improve cross-task sharing. It introduces a soft ordering mechanism that jointly learns how to apply shared layers at different depths for different tasks, outperforming fixed-order MTL and single-task baselines across MNIST, UCI, Omniglot, and CelebA. The results reveal that shared layers can serve as generalizable primitives assembled in task-dependent ways, suggesting a path toward scalable, modular building blocks for unseen tasks. Overall, soft ordering not only boosts performance but also provides insights into the functional roles of learned layers across diverse tasks.
Abstract
Existing deep multitask learning (MTL) approaches align layers shared between tasks in a parallel ordering. Such an organization significantly constricts the types of shared structure that can be learned. The necessity of parallel ordering for deep MTL is first tested by comparing it with permuted ordering of shared layers. The results indicate that a flexible ordering can enable more effective sharing, thus motivating the development of a soft ordering approach, which learns how shared layers are applied in different ways for different tasks. Deep MTL with soft ordering outperforms parallel ordering methods across a series of domains. These results suggest that the power of deep MTL comes from learning highly general building blocks that can be assembled to meet the demands of each task.
