Multi-Task Dense Prediction via Mixture of Low-Rank Experts
Yuqi Yang, Peng-Tao Jiang, Qibin Hou, Hao Zhang, Jinwei Chen, Bo Li
TL;DR
MLoRE tackles decoder-focused multi-task dense prediction by explicitly modeling global task relations through a task-sharing generic convolution path and by scaling capacity with low-rank, linear MoE experts. The design has three paths: a task-sharing generic convolution, shared low-rank experts with per-task routing, and task-specific low-rank experts. Together they enable both cross-task correlation and task discrimination, while linearity and re-parameterization at inference keep parameters and FLOPs in check. Key contributions include formalizing a global-relation-aware MoE for decoders, introducing low-rank convolutions as MoE experts, and demonstrating gains on PASCAL-Context and NYUD-v2 with efficient deployment. The approach achieves state-of-the-art results across multiple dense prediction tasks and offers practical advantages for scalable, decoder-focused multi-task learning in vision systems.
Abstract
Previous multi-task dense prediction methods based on the Mixture of Experts (MoE) have achieved strong performance, but they neglect the importance of explicitly modeling the global relations among all tasks. In this paper, we present a novel decoder-focused method for multi-task dense prediction, called Mixture-of-Low-Rank-Experts (MLoRE). To model the global task relationships, MLoRE adds a generic convolution path to the original MoE structure, where each task feature can go through this path for explicit parameter sharing. Furthermore, to control the parameters and computational cost brought by the increase in the number of experts, we take inspiration from LoRA and propose to use a low-rank form of the vanilla convolution in the expert network. Since the low-rank experts have fewer parameters and can be dynamically re-parameterized into the generic convolution, the parameters and computational cost change little as the number of experts increases. Benefiting from this design, we increase the number of experts and their receptive field to enlarge the representation capacity, facilitating the learning of multiple dense tasks in a unified network. Extensive experiments on the PASCAL-Context and NYUD-v2 benchmarks show that MLoRE achieves superior performance compared to previous state-of-the-art methods on all metrics. Our code is available at https://github.com/YuqiYang213/MLoRE.
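The claim that low-rank experts can be folded into the generic convolution follows from linearity, and a small numerical check may make it concrete. The sketch below is not the authors' implementation. It assumes each expert is a k x k down-projection to rank r followed by a 1 x 1 up-projection with no nonlinearity in between, and that the router output is a fixed gate vector. The names `conv2d`, `merge_low_rank`, `gates`, and all sizes are hypothetical and chosen only for illustration.

```python
import numpy as np

def conv2d(x, w):
    """Naive 'valid' cross-correlation. x: (C_in, H, W), w: (C_out, C_in, k, k)."""
    c_out, _, k, _ = w.shape
    h_out, w_out = x.shape[1] - k + 1, x.shape[2] - k + 1
    y = np.zeros((c_out, h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[:, i:i + k, j:j + k]               # (C_in, k, k)
            y[:, i, j] = np.tensordot(w, patch, axes=3)  # contract C_in, k, k
    return y

def merge_low_rank(down, up):
    """Fold a k x k down-projection and a 1 x 1 up-projection into one k x k kernel."""
    return np.einsum("or,rikl->oikl", up[:, :, 0, 0], down)

rng = np.random.default_rng(0)
C, r, k, n_experts = 8, 2, 3, 4           # illustrative sizes, not from the paper
x = rng.standard_normal((C, 6, 6))
w_generic = rng.standard_normal((C, C, k, k))
experts = [
    (rng.standard_normal((r, C, k, k)),   # down: C -> r, k x k
     rng.standard_normal((C, r, 1, 1)))   # up:   r -> C, 1 x 1
    for _ in range(n_experts)
]
gates = np.array([0.6, 0.0, 0.4, 0.0])    # hypothetical router output

# Multi-path form: generic path plus gated low-rank experts, each run separately.
y_paths = conv2d(x, w_generic)
for g, (down, up) in zip(gates, experts):
    y_paths += g * conv2d(conv2d(x, down), up)

# Re-parameterized form: fold every expert into a single k x k kernel first.
w_merged = w_generic + sum(
    g * merge_low_rank(down, up) for g, (down, up) in zip(gates, experts)
)
y_merged = conv2d(x, w_merged)

full_params = w_generic.size                               # C * C * k * k
low_rank_params = experts[0][0].size + experts[0][1].size  # r * C * k * k + C * r
print(np.allclose(y_paths, y_merged), full_params, low_rank_params)
```

Under these assumptions the two outputs agree up to floating-point error, so the inference cost is that of one k x k convolution regardless of how many experts are used, while each additional expert adds far fewer parameters than a full convolution would.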
