Graph Coloring for Multi-Task Learning
Santosh Patapati
TL;DR
Gradient interference in multi-task learning slows convergence and degrades performance when tasks pull updates in conflicting directions. SON-GOKU addresses this by estimating cross-task gradient interference online, constructing a sparse conflict graph, and greedily coloring it to form low-conflict task groups that are updated sequentially; the graph is refreshed periodically to track evolving task relationships. The approach offers descent guarantees, preserves the standard nonconvex SGD rate up to a small factor, and can recover population-level task partitions with high probability. Empirically, SON-GOKU yields consistent improvements across six datasets and is compatible with, and often improves, existing MTL optimizers such as AdaTask and PCGrad, while keeping time and memory costs scalable. This makes interference-aware scheduling practical for diverse multi-task settings.
Abstract
When objectives conflict in multi-task learning, their gradients interfere with one another, slowing convergence and potentially reducing the final model's performance. To address this, we introduce SON-GOKU, a scheduler that estimates gradient interference between tasks, constructs an interference graph, and then applies greedy graph coloring to partition tasks into groups whose members align well with each other. At each training step, only one group (color class) of tasks is activated, and the partition is recomputed periodically as task relationships evolve throughout training. By ensuring that each mini-batch contains only tasks that pull the model in compatible directions, our method improves the effectiveness of any underlying multi-task learning optimizer without additional tuning. Because tasks within a group update in compatible directions, multi-task learning improves model performance rather than impeding it. Empirical results on six datasets show that this interference-aware graph-coloring approach consistently outperforms baselines and state-of-the-art multi-task optimizers. We provide extensive theory showing why grouping and sequential updates improve multi-task learning, with guarantees on descent, convergence, and accurate identification of which tasks conflict or align.
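The pipeline described above (measure interference, build a conflict graph, greedily color it, activate one color class per step) can be illustrated with a minimal sketch. This is not the paper's implementation. It assumes interference is measured by the cosine similarity of per-task gradient estimates, that two tasks conflict when that similarity falls below a hypothetical threshold `-tau`, and that vertices are colored in order of decreasing degree. The function names and the toy gradients are illustrative only, and the online estimation and periodic refresh are omitted.

```python
import numpy as np

def conflict_graph(grads, tau=0.0):
    """Build a boolean adjacency matrix from per-task gradients.

    grads: (T, d) array, one gradient estimate per task.
    An edge (i, j) is set when cosine similarity < -tau
    (assumed conflict criterion; tau is a hypothetical threshold).
    """
    g = np.asarray(grads, dtype=float)
    norms = np.linalg.norm(g, axis=1, keepdims=True)
    unit = g / np.maximum(norms, 1e-12)  # guard against zero gradients
    cos = unit @ unit.T
    adj = cos < -tau
    np.fill_diagonal(adj, False)  # a task never conflicts with itself
    return adj

def greedy_coloring(adj):
    """Give each task the smallest color unused by its conflicting neighbors.

    Tasks are visited in order of decreasing degree (an assumed ordering).
    """
    num_tasks = adj.shape[0]
    order = sorted(range(num_tasks), key=lambda i: (-int(adj[i].sum()), i))
    colors = [-1] * num_tasks
    for i in order:
        used = {colors[j] for j in range(num_tasks) if adj[i, j] and colors[j] >= 0}
        c = 0
        while c in used:
            c += 1
        colors[i] = c
    return colors

def color_classes(colors):
    """Group task indices by color; each group is one low-conflict update set."""
    groups = {}
    for task, c in enumerate(colors):
        groups.setdefault(c, []).append(task)
    return [groups[c] for c in sorted(groups)]

# Toy example: tasks 0 and 1 agree, task 2 opposes both, task 3 is orthogonal.
grads = np.array([[1.0, 0.0], [0.9, 0.1], [-1.0, 0.0], [0.0, 1.0]])
adj = conflict_graph(grads, tau=0.1)
groups = color_classes(greedy_coloring(adj))

# Sequential schedule: training step t activates one color class, round-robin.
schedule = [groups[t % len(groups)] for t in range(4)]
print(groups, schedule)
```

In this toy case the two mutually aligned tasks land in one group and the task that opposes them lands in another, so no step mixes conflicting gradients. In a real training loop the gradient estimates would be refreshed and the coloring recomputed at some interval.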
