Dataless Weight Disentanglement in Task Arithmetic via Kronecker-Factored Approximate Curvature
Angelo Porrello, Pietro Buzzega, Felix Dangel, Thomas Sommariva, Riccardo Salami, Lorenzo Bonicelli, Simone Calderara
TL;DR
Task Arithmetic enables modular model edits, but combining task vectors causes cross-task interference. The authors recast regularization against representation drift as a curvature approximation problem and derive a dataless regularizer (TAK) that uses Kronecker-Factored Approximate Curvature (KFAC) to approximate the Generalized Gauss-Newton matrix. A merging strategy aggregates per-task curvature factors, giving constant complexity in the number of tasks and robustness to task vector rescaling. Experiments on vision and language tasks show state-of-the-art performance on task addition and negation, with efficient training and no need for external task data, which preserves data privacy. The work advances practical, privacy-preserving composition of foundation models.
Abstract
Task Arithmetic yields a modular, scalable way to adapt foundation models. Combining multiple task vectors, however, can lead to cross-task interference, causing representation drift and degraded performance. Representation drift regularization provides a natural remedy to disentangle task vectors; however, existing approaches typically require external task data, conflicting with modularity and data availability constraints (e.g., privacy requirements). We propose a dataless approach by framing regularization against representation drift as a curvature matrix approximation problem. This allows us to leverage well-established techniques; in particular, we adopt Kronecker-Factored Approximate Curvature and obtain a practical regularizer that achieves state-of-the-art results in task addition and negation. Our method has constant complexity in the number of tasks and promotes robustness to task vector rescaling, eliminating the need for held-out tuning.
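To make the idea of a dataless drift penalty concrete, the following is a minimal sketch for a single linear layer, not the authors' implementation. It assumes the output-side Kronecker factor is the identity, so that the mean squared representation drift caused by a weight perturbation can be recovered exactly from a stored input second-moment matrix. All names (`input_factor`, `drift_with_data`, `drift_dataless`, `xs_task1`, `tau_task2`) are hypothetical and chosen for illustration only.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]

def trace(a):
    """Sum of the diagonal entries of a square matrix."""
    return sum(a[i][i] for i in range(len(a)))

def input_factor(xs):
    """Uncentered second moment A = (1/N) * sum_n x_n x_n^T of the layer inputs."""
    d = len(xs[0])
    return [[sum(x[i] * x[j] for x in xs) / len(xs) for j in range(d)] for i in range(d)]

def drift_with_data(delta, xs):
    """Mean squared output change (1/N) * sum_n ||delta x_n||^2; requires the data."""
    total = 0.0
    for x in xs:
        out = [sum(w * xi for w, xi in zip(row, x)) for row in delta]
        total += sum(o * o for o in out)
    return total / len(xs)

def drift_dataless(delta, a):
    """Same quantity from the stored factor alone: trace(delta A delta^T)."""
    return trace(matmul(matmul(delta, a), transpose(delta)))

rng = random.Random(0)
d_in, d_out, n = 4, 3, 50

# Hypothetical layer inputs seen by task 1, and a task vector learned for task 2.
xs_task1 = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(n)]
tau_task2 = [[rng.gauss(0, 0.1) for _ in range(d_in)] for _ in range(d_out)]

# The owner of task 1 computes the factor once and shares only this matrix.
a_task1 = input_factor(xs_task1)

with_data = drift_with_data(tau_task2, xs_task1)
dataless = drift_dataless(tau_task2, a_task1)  # no access to xs_task1 needed
```

The two quantities agree up to floating-point error, so the penalty on task 2's vector can be evaluated without task 1's data. In this simplified setting the penalty is linear in the stored factor, so factors from several tasks can be summed into a single matrix; how the paper aggregates full Kronecker factors is not reproduced here.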
