Monotonic Transformation Invariant Multi-task Learning
Surya Murthy, Kushagra Gupta, Mustafa O. Karabag, David Fridovich-Keil, Ufuk Topcu
TL;DR
This work addresses the instability caused by arbitrary monotonic scaling of task losses in multi-task learning. It introduces DiBS-MTL, a monotonic-transformation-invariant adaptation of the Direction-based Bargaining Solution that operates on normalized task gradients to reach Pareto stationary points even in nonconvex settings. The authors prove that, under standard assumptions, a subsequence of the iterates converges to a Pareto stationary point, and they show experimentally that DiBS-MTL outperforms state-of-the-art baselines when task losses are poorly scaled while remaining competitive on standard benchmarks. The approach offers a principled, efficient alternative for robust multi-task optimization in heterogeneous loss landscapes.
Abstract
Multi-task learning (MTL) algorithms typically rely on schemes that combine different task losses or their gradients through weighted averaging. These methods aim to find Pareto stationary points by using heuristics that require access to task loss values, gradients, or both. A central challenge is that task losses can be arbitrarily scaled relative to one another, causing certain tasks to dominate training and degrade overall performance. A recent advance in cooperative bargaining theory, the Direction-based Bargaining Solution (DiBS), yields Pareto stationary solutions immune to task domination because of its invariance to monotonic nonaffine task loss transformations. However, the convergence behavior of DiBS in nonconvex MTL settings is currently not understood. To this end, we prove that under standard assumptions, a subsequence of DiBS iterates converges to a Pareto stationary point when task losses are nonconvex, and propose DiBS-MTL, an adaptation of DiBS to the MTL setting that is more computationally efficient than prior bargaining-inspired MTL approaches. Finally, we empirically show that DiBS-MTL is competitive with leading MTL methods on standard benchmarks, and that it drastically outperforms state-of-the-art baselines in multiple examples with poorly scaled task losses, highlighting the importance of invariance to nonaffine monotonic transformations of the loss landscape. Code is available at https://github.com/suryakmurthy/dibs-mtl
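
The invariance mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' DiBS-MTL implementation; it only checks the chain-rule fact that a strictly increasing transformation h of a loss rescales its gradient by the positive factor h'(loss), so the normalized gradient direction is unchanged. The quadratic loss, the transformation h(l) = exp(5 l), and all function names below are assumptions chosen for illustration.

```python
import numpy as np

def loss(theta):
    # Hypothetical task loss: a positive quadratic (assumption, not from the paper).
    return 1.0 + np.sum((theta - 1.0) ** 2)

def grad_loss(theta):
    # Analytic gradient of the hypothetical loss above.
    return 2.0 * (theta - 1.0)

def transformed_grad(theta):
    # Gradient of h(loss) with h(l) = exp(5 l), which is strictly increasing
    # and nonaffine. Chain rule: grad h(loss) = h'(loss) * grad loss, h' > 0.
    return 5.0 * np.exp(5.0 * loss(theta)) * grad_loss(theta)

def normalize(g, eps=1e-12):
    # Unit-norm direction; eps guards against division by zero.
    return g / (np.linalg.norm(g) + eps)

theta = np.array([0.5, 2.0, -1.0])
g = grad_loss(theta)
g_t = transformed_grad(theta)

# The raw gradient magnitudes differ by many orders of magnitude ...
scale_ratio = np.linalg.norm(g_t) / np.linalg.norm(g)
# ... but the normalized directions coincide.
same_direction = np.allclose(normalize(g), normalize(g_t))

print(f"magnitude ratio: {scale_ratio:.3e}")
print(f"same normalized direction: {same_direction}")
```

A weighted average of raw gradients would be dominated by the transformed task here, whereas any rule that depends only on normalized gradient directions sees the two versions of the task as identical.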
