Fantastic Multi-Task Gradient Updates and How to Find Them In a Cone
Negar Hassanpour, Muhammad Kamran Janjua, Kunlin Zhang, Sepehr Lavasani, Xiaowen Zhang, Chunhua Zhou, Chao Gao
TL;DR
This work tackles gradient conflicts in multi-task learning by introducing ConicGrad, which constrains the gradient update to lie within a cone around a reference gradient $g_0$, defined as the average task gradient. The method formulates a constrained max-min objective, derives a closed-form update via duality, computes the optimal direction $d^{*}$ efficiently using the Sherman–Morrison–Woodbury identity, and decouples direction from magnitude through normalization. The authors establish convergence guarantees under standard Lipschitz assumptions and show, across toy, supervised, and reinforcement learning benchmarks, that ConicGrad often achieves state-of-the-art or competitive performance with favorable scalability and stability. The method offers practical benefits for high-dimensional models and diverse task sets; future work focuses on dynamically adapting the cone-angle parameter $c$ during training.
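For concreteness, the cone constraint and max-min objective summarized above can plausibly be written as follows. The notation is an assumption made for illustration: $g_i$ denotes the gradient of task $i$ among $K$ tasks, $d$ the update direction, and $c \in [-1, 1]$ a cosine-similarity threshold that sets the width of the cone. The paper's exact formulation may differ.

$$
g_0 = \frac{1}{K}\sum_{i=1}^{K} g_i, \qquad \max_{d}\ \min_{i \in \{1,\dots,K\}} \langle g_i, d \rangle \quad \text{s.t.} \quad \frac{\langle g_0, d \rangle}{\lVert g_0 \rVert \, \lVert d \rVert} \ge c .
$$

Under this reading, a larger $c$ narrows the cone and keeps $d$ closer to the average gradient, while a smaller $c$ gives the update more freedom to favor the worst-off task.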
Abstract
Balancing competing objectives remains a fundamental challenge in multi-task learning (MTL), primarily due to conflicting gradients across individual tasks. A common solution relies on computing a dynamic gradient update vector that balances competing tasks as optimization progresses. Building on this idea, we propose ConicGrad, a principled, scalable, and robust MTL approach formulated as a constrained optimization problem. Our method introduces an angular constraint to dynamically regulate gradient update directions, confining them within a cone centered on the reference gradient of the overall objective. By balancing task-specific gradients without over-constraining their direction or magnitude, ConicGrad effectively resolves inter-task gradient conflicts. Moreover, our framework ensures computational efficiency and scalability to high-dimensional parameter spaces. We conduct extensive experiments on standard supervised learning and reinforcement learning MTL benchmarks, and demonstrate that ConicGrad achieves state-of-the-art performance across diverse tasks.
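The angular constraint described above can be illustrated with a small NumPy sketch. It is not the authors' implementation and does not compute the ConicGrad update. It only shows, under stated assumptions, how one might detect conflicting task gradients and test whether a candidate update lies inside a cone around the average gradient. The function names (`reference_gradient`, `has_conflict`, `in_cone`) and the interpretation of `c` as a cosine-similarity threshold are hypothetical choices made for this example.

```python
import numpy as np


def reference_gradient(task_grads):
    """Average of the task gradients (rows of task_grads), used as the cone axis g0."""
    return np.mean(task_grads, axis=0)


def cosine(u, v, eps=1e-12):
    """Cosine similarity between two vectors; eps guards against division by zero."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + eps))


def has_conflict(task_grads):
    """True if any pair of task gradients has a negative inner product."""
    gram = task_grads @ task_grads.T
    return bool((gram < 0).any())


def in_cone(d, g0, c):
    """True if d lies in the cone around g0, assuming the test cos(d, g0) >= c."""
    return cosine(d, g0) >= c


# Two toy task gradients that point in partly opposing directions.
task_grads = np.array([[1.0, 0.0],
                       [-0.5, 1.0]])
g0 = reference_gradient(task_grads)

conflict = has_conflict(task_grads)            # the two gradients conflict
avg_ok = in_cone(g0, g0, c=0.5)                # the cone axis is always inside
task1_narrow = in_cone(task_grads[0], g0, c=0.5)   # narrow cone
task1_wide = in_cone(task_grads[0], g0, c=0.25)    # wider cone
```

In this toy setting, following the first task's gradient alone falls outside the narrower cone but inside the wider one, which is the kind of trade-off the cone parameter is meant to control.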
