PSMGD: Periodic Stochastic Multi-Gradient Descent for Fast Multi-Objective Optimization
Mingjing Xu, Peizhong Ju, Jia Liu, Haibo Yang
TL;DR
PSMGD addresses the heavy per-iteration cost of gradient manipulation methods in multi-objective optimization by recomputing the objective weights only periodically and reusing them in between, exploiting the observation that dynamic weights change little over short intervals. The method achieves state-of-the-art convergence rates for strongly convex, general convex, and non-convex objectives. It also introduces backpropagation (BP) complexity as a measure of computational workload, and shows that the BP complexity can be independent of the number of objectives when the recomputation interval scales with the number of objectives. Extensive experiments on QM-9 and NYU-v2 complement the theoretical guarantees: PSMGD delivers comparable or superior performance to existing MOO methods with significantly shorter training times. Together, these results suggest that PSMGD is a practical, scalable approach to fast multi-objective optimization in deep learning and related tasks.
Abstract
Multi-objective optimization (MOO) lies at the core of many machine learning (ML) applications that involve multiple, potentially conflicting objectives (e.g., multi-task learning and multi-objective reinforcement learning, among many others). Despite the long history of MOO, recent years have witnessed a surge of interest within the ML community in gradient manipulation algorithms for MOO, thanks to the availability of gradient information in many ML problems. However, existing gradient manipulation methods for MOO often suffer from long training times, primarily because they compute dynamic weights by solving an additional optimization problem in every iteration to determine a common descent direction that decreases all objectives simultaneously. To address this challenge, we propose a new and efficient algorithm called Periodic Stochastic Multi-Gradient Descent (PSMGD) to accelerate MOO. PSMGD is motivated by the key observation that the dynamic weights across objectives change only slightly over short intervals of the optimization process, during which the model parameters themselves undergo only minor updates. Consequently, PSMGD computes these dynamic weights only periodically and reuses them in the intervening iterations, thereby effectively reducing the computational overhead. Theoretically, we prove that PSMGD achieves state-of-the-art convergence rates for strongly convex, general convex, and non-convex functions. Additionally, we introduce a new computational complexity measure, termed backpropagation complexity, and demonstrate that PSMGD can achieve an objective-independent backpropagation complexity. Through extensive experiments, we verify that PSMGD provides comparable or superior performance to state-of-the-art MOO algorithms while significantly reducing training time.
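The periodic weight reuse described above can be illustrated with a minimal sketch. This is not the authors' implementation, and everything specific in it is an assumption made for illustration: the two toy quadratic objectives, the function names `min_norm_weights` and `psmgd_sketch`, the closed-form two-objective min-norm weight solver (MGDA-style), the use of exact rather than stochastic gradients, and the backpropagation accounting (one backward pass per objective when weights are recomputed, one backward pass on the weighted sum otherwise).

```python
import numpy as np


def min_norm_weights(g1, g2):
    """Closed-form min-norm weights for two gradients (MGDA-style, assumed solver).

    Minimizes ||lam * g1 + (1 - lam) * g2||^2 over lam in [0, 1].
    """
    diff = g1 - g2
    denom = float(diff @ diff)
    if denom < 1e-12:
        # Gradients coincide, so every convex combination gives the same direction.
        return np.array([0.5, 0.5])
    lam = float(np.clip(((g2 - g1) @ g2) / denom, 0.0, 1.0))
    return np.array([lam, 1.0 - lam])


def psmgd_sketch(x0, a, b, steps=200, period=4, lr=0.1):
    """Toy periodic multi-gradient descent on two quadratics (illustrative only).

    Objectives (assumed): f1(x) = 0.5 * ||x - a||^2 and f2(x) = 0.5 * ||x - b||^2.
    Returns the final iterate, the last weights, and the assumed backprop count.
    """
    x = np.asarray(x0, dtype=float).copy()
    weights = np.array([0.5, 0.5])
    backprops = 0
    for t in range(steps):
        if t % period == 0:
            # Recompute step: one gradient per objective, then solve for weights.
            g1, g2 = x - a, x - b
            weights = min_norm_weights(g1, g2)
            direction = weights[0] * g1 + weights[1] * g2
            backprops += 2
        else:
            # Reuse step: gradient of the weighted sum with stale weights,
            # counted as a single backward pass.
            direction = weights[0] * (x - a) + weights[1] * (x - b)
            backprops += 1
        x = x - lr * direction
    return x, weights, backprops


if __name__ == "__main__":
    a = np.array([0.0, 0.0])
    b = np.array([1.0, 0.0])
    x, w, n_bp = psmgd_sketch([3.0, 3.0], a, b, steps=200, period=4)
    print("final iterate:", x)
    print("last weights:", w)
    print("backprops with period 4:", n_bp)
```

Under this accounting, running T iterations with period R and S objectives costs roughly (T/R)·S + (T − T/R) backward passes, so choosing R proportional to S keeps the total at a small constant multiple of T regardless of S. In the toy problem the Pareto set is the segment between `a` and `b`, and the iterate approaches it whether the weights are fresh or reused.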
