Quantifying Task Priority for Multi-Task Optimization
Wooseong Jeong, Kuk-Jin Yoon
TL;DR
This work addresses negative transfer in multi-task learning by reframing parameter updates in terms of task priority and connection strength. It introduces a two-phase optimization. Phase 1 learns per-task priorities by updating shared parameters sequentially through task-specific connections, with the aim of discovering new Pareto-optimal solutions. Phase 2 preserves these priorities by computing a normalized connection-strength measure and projecting gradients to align with the top-priority task in each channel. The authors prove that incorporating task priority expands the Pareto frontier and provide convergence arguments. Experiments on NYUD-v2, PASCAL-Context, and Cityscapes show better multi-task performance than gradient-manipulation baselines under various loss-scaling schemes. The method is reported to be robust across architectures with minimal parameter overhead, which makes it practical for complex multi-task vision systems.
Abstract
The goal of multi-task learning is to learn diverse tasks within a single unified network. Because each task has its own objective function, conflicts emerge during training and cause negative transfer among tasks. Earlier research identified these conflicting gradients in the parameters shared between tasks and attempted to align them in the same direction. However, we prove that such optimization strategies lead to sub-optimal Pareto solutions because they cannot accurately determine the individual contribution of each parameter to each task. In this paper, we propose the concept of task priority to evaluate parameter contributions across different tasks. To learn task priority, we identify the types of connections, that is, the links between parameters that are influenced by task-specific losses during backpropagation. The strength of a connection is gauged by the magnitude of its parameters and is used to determine task priority. Based on these, we present a new method, connection strength-based optimization for multi-task learning, which consists of two phases. The first phase learns the task priority within the network, and the second phase modifies the gradients while upholding this priority. This ultimately leads to new Pareto-optimal solutions for multiple tasks. Through extensive experiments, we show that our approach greatly enhances multi-task performance in comparison to earlier gradient manipulation methods.
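To make the second phase more concrete, the following is a minimal sketch of a priority-preserving gradient step for a single channel. It is not the authors' implementation. It assumes that each task's connection strength for the channel is already available as a non-negative magnitude, that normalization means dividing by the sum over tasks, and that "projecting gradients" can be illustrated with the standard projection that removes a conflicting component (negative inner product) relative to the top-priority task's gradient. The function names `normalized_strength`, `top_priority_task`, and `project_onto_priority` are hypothetical.

```python
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalized_strength(magnitudes: List[float]) -> List[float]:
    """Normalize per-task magnitudes for one channel so they sum to 1.

    Assumption: sum-normalization stands in for the paper's
    normalized connection-strength measure.
    """
    total = sum(magnitudes)
    if total == 0:
        # No task dominates; fall back to a uniform split.
        return [1.0 / len(magnitudes)] * len(magnitudes)
    return [m / total for m in magnitudes]


def top_priority_task(magnitudes: List[float]) -> int:
    """Index of the task with the largest normalized strength."""
    strengths = normalized_strength(magnitudes)
    return max(range(len(strengths)), key=strengths.__getitem__)


def project_onto_priority(grads: List[Vector], priority: int) -> List[Vector]:
    """Remove, from every other task's gradient, the component that
    conflicts with the top-priority task's gradient.

    Assumption: a conflict is a negative inner product, and it is
    resolved by projecting onto the normal plane of the priority gradient.
    """
    g_top = grads[priority]
    norm_sq = dot(g_top, g_top)
    projected = []
    for task, g in enumerate(grads):
        inner = dot(g, g_top)
        if task == priority or norm_sq == 0 or inner >= 0:
            projected.append(list(g))  # no conflict: keep as is
        else:
            coef = inner / norm_sq
            projected.append([gi - coef * ti for gi, ti in zip(g, g_top)])
    return projected


# Illustrative values for one channel with two tasks.
magnitudes = [1.0, 3.0]
priority = top_priority_task(magnitudes)
grads = [[1.0, -1.0], [0.0, 1.0]]
new_grads = project_onto_priority(grads, priority)
```

In this toy case the second task has the larger strength, so its gradient is left untouched, while the first task's gradient loses the component that opposed it. How the paper defines the strength measure and the exact projection rule may differ from these assumptions.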
