Task Addition and Weight Disentanglement in Closed-Vocabulary Models
Adam Hazimeh, Alessandro Favero, Pascal Frossard
TL;DR
The paper investigates whether task arithmetic, previously demonstrated mainly on open-vocabulary models, can be used to edit closed-vocabulary pre-trained image classifiers. It defines task vectors $\tau_t = \theta_{\rm ft}^t - \theta_{\rm pre}$ and fuses tasks via $\theta_{\rm new} = \theta_{\rm pre} + \sum_t \lambda_t \tau_t$; in the closed-vocabulary setting, the classification heads are first aligned via linear probing. The main findings are that weight disentanglement is a general consequence of pre-training and that it enables effective task addition across supervised, self-supervised, and CLIP-like pre-training, with larger data and model scales improving performance. Furthermore, linear probing often rivals task addition as a cheaper baseline, suggesting a practical alternative for multi-task editing of models pre-trained without language supervision. Overall, the work extends task arithmetic to a wider class of pre-trained models and highlights the trade-off between modularity and computational efficiency in multi-task deployment.
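The two formulas above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Params` layout (a dictionary from parameter name to a flat list of floats), the function names, and the toy numbers are all assumptions made for illustration, and the head-alignment step via linear probing is omitted.

```python
from typing import Dict, List

# Hypothetical flat parameter store: parameter name -> list of values.
Params = Dict[str, List[float]]


def task_vector(theta_pre: Params, theta_ft: Params) -> Params:
    """Compute tau_t = theta_ft^t - theta_pre, parameter by parameter."""
    return {
        name: [ft - pre for ft, pre in zip(theta_ft[name], theta_pre[name])]
        for name in theta_pre
    }


def task_addition(theta_pre: Params, taus: List[Params], lambdas: List[float]) -> Params:
    """Compute theta_new = theta_pre + sum_t lambda_t * tau_t."""
    # Copy the pre-trained weights so the input is left untouched.
    theta_new = {name: list(values) for name, values in theta_pre.items()}
    for tau, lam in zip(taus, lambdas):
        for name, delta in tau.items():
            theta_new[name] = [w + lam * d for w, d in zip(theta_new[name], delta)]
    return theta_new


# Toy example with made-up numbers: two "tasks" fine-tuned from the same weights.
theta_pre = {"w": [1.0, 2.0]}
theta_ft_a = {"w": [1.5, 2.0]}
theta_ft_b = {"w": [1.0, 3.0]}

taus = [task_vector(theta_pre, theta_ft_a), task_vector(theta_pre, theta_ft_b)]
theta_new = task_addition(theta_pre, taus, lambdas=[1.0, 0.5])
```

In practice the same arithmetic would be applied to the tensors of a model checkpoint rather than to Python lists, and the coefficients $\lambda_t$ would typically be chosen on held-out data.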
Abstract
Task arithmetic has recently emerged as a promising method for editing pre-trained \textit{open-vocabulary} models, offering a cost-effective alternative to standard multi-task fine-tuning. However, despite the abundance of \textit{closed-vocabulary} models that are not pre-trained with language supervision, applying task arithmetic to these models remains unexplored. In this paper, we deploy and study task addition in closed-vocabulary image classification models. We consider different pre-training schemes and find that \textit{weight disentanglement} -- the property enabling task arithmetic -- is a general consequence of pre-training, as it appears in different pre-trained closed-vocabulary models. In fact, we find that pre-trained closed-vocabulary vision transformers can also be edited with task arithmetic, achieving high task addition performance and enabling the efficient deployment of multi-task models. Finally, we demonstrate that simple linear probing is a competitive baseline to task addition. Overall, our findings expand the applicability of task arithmetic to a broader class of pre-trained models and open the way for more efficient use of pre-trained models in diverse settings.
