Continual Optimization with Symmetry Teleportation for Multi-Task Learning
Zhipeng Zhou, Ziqiao Meng, Pengcheng Wu, Peilin Zhao, Chunyan Miao
TL;DR
This work tackles optimization conflicts and task imbalance in multi-task learning by introducing COST, a practical continual optimization framework based on symmetry teleportation. When a conflict arises, COST uses a low-rank adapter (LoRA) to teleport the shared backbone to a loss-invariant point with a higher gradient. Loss invariance is enforced through a convex-like drift term, and gradient advancement is guided by a sharpness-based objective. A historical trajectory reuse strategy preserves optimizer momentum across teleports, so that training continues to benefit from advanced optimizers such as Adam. Empirically, COST achieves state-of-the-art or competitive results across diverse MTL benchmarks, and its plug-and-play nature makes it compatible with a broad range of existing MTL methods.
Abstract
Multi-task learning (MTL) is a widely explored paradigm that enables the simultaneous learning of multiple tasks using a single model. Despite numerous solutions, the key issues of optimization conflict and task imbalance remain under-addressed, limiting performance. Unlike existing optimization-based approaches that typically reweight task losses or gradients to mitigate conflicts or promote progress, we propose a novel approach based on Continual Optimization with Symmetry Teleportation (COST). During MTL optimization, when an optimization conflict arises, we seek an alternative loss-equivalent point on the loss landscape to reduce conflict. Specifically, we utilize a low-rank adapter (LoRA) to facilitate this practical teleportation by designing convergent, loss-invariant objectives. Additionally, we introduce a historical trajectory reuse strategy to continually leverage the benefits of advanced optimizers. Extensive experiments on multiple mainstream datasets demonstrate the effectiveness of our approach. COST is a plug-and-play solution that enhances a wide range of existing MTL methods. When integrated with state-of-the-art methods, COST achieves superior performance.
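To make the notion of a loss-equivalent point concrete, the following is a minimal toy sketch of symmetry teleportation, not the LoRA-based procedure of COST. It assumes a scalar two-layer linear model whose loss is invariant under the rescaling (w1, w2) -> (a * w1, w2 / a), and it picks the candidate rescaling with the largest gradient norm. The model, the names `loss`, `grad`, `grad_norm_sq`, and `teleport`, and the candidate scales are all illustrative assumptions.

```python
# Toy illustration only: a scalar two-layer linear model y_hat = w2 * w1 * x.
# The loss depends on the product w1 * w2, so (a * w1, w2 / a) is a symmetry.
# This is NOT the LoRA-based teleportation used by COST.

def loss(w1, w2, x=1.0, y=2.0):
    """Squared error of the toy model on a single (x, y) pair."""
    return 0.5 * (w2 * w1 * x - y) ** 2

def grad(w1, w2, x=1.0, y=2.0):
    """Analytic gradient of the loss with respect to (w1, w2)."""
    residual = w2 * w1 * x - y
    return (residual * w2 * x, residual * w1 * x)

def grad_norm_sq(w1, w2):
    """Squared Euclidean norm of the gradient."""
    g1, g2 = grad(w1, w2)
    return g1 * g1 + g2 * g2

def teleport(w1, w2, scales=(0.25, 0.5, 1.0, 2.0, 4.0)):
    """Among loss-invariant rescalings, return the one with the largest gradient norm."""
    candidates = [(a * w1, w2 / a) for a in scales]
    return max(candidates, key=lambda p: grad_norm_sq(*p))

w1, w2 = 1.0, 1.0
t1, t2 = teleport(w1, w2)
print("loss before/after:", loss(w1, w2), loss(t1, t2))
print("grad norm^2 before/after:", grad_norm_sq(w1, w2), grad_norm_sq(t1, t2))
```

In this toy setting the loss is unchanged by construction while the gradient norm grows, which is the property teleportation exploits. In a deep multi-task backbone no such closed-form symmetry is generally available, which is why the approach described above instead learns the teleportation through a low-rank adapter with loss-invariant objectives.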
