Subspace-Boosted Model Merging

Ronald Skorobogat; Karsten Roth; Mariana-Iuliana Georgescu

TL;DR

The paper identifies rank collapse as a fundamental limitation of Task Arithmetic-based model merging: as more experts are merged, common information increasingly dominates task-specific signals. It introduces Subspace Boosting, which recovers suppressed task directions by raising underutilized singular values in the task-vector space, improving merging performance across vision and language benchmarks (often by more than 10%) and across multiple merging methods. For interpretability, it employs Higher-Order Generalized Singular Value Decomposition (HO-GSVD) to project task vectors into a shared subspace, which allows experts to be compared directly via Alignment Matrices and supports principled expert selection. Together, Subspace Boosting and HO-GSVD provide both practical performance gains and a transparent framework for understanding and choosing among merged experts.
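The rank-collapse effect described above can be illustrated with a toy experiment. The sketch below is not the paper's analysis: it assumes synthetic task vectors built from one shared rank-1 component plus independent task-specific noise, and it uses the stable rank (squared Frobenius norm divided by squared spectral norm) as a stand-in for effective rank. The function names `toy_task_vectors`, `task_arithmetic_merge`, and `stable_rank` are hypothetical and chosen for illustration only.

```python
import numpy as np


def stable_rank(matrix):
    """Smooth proxy for rank: ||A||_F^2 / ||A||_2^2 (scale invariant)."""
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return float(np.sum(singular_values**2) / singular_values[0] ** 2)


def toy_task_vectors(num_experts, dim=64, seed=0):
    """Assumed toy model: every expert = shared rank-1 part + its own noise."""
    rng = np.random.default_rng(seed)
    u = rng.standard_normal(dim)
    v = rng.standard_normal(dim)
    shared = np.outer(u / np.linalg.norm(u), v / np.linalg.norm(v))
    return [
        shared + rng.standard_normal((dim, dim)) / np.sqrt(dim)
        for _ in range(num_experts)
    ]


def task_arithmetic_merge(task_vectors, scale=1.0):
    """Task Arithmetic: the merged task vector is a scaled sum of task vectors."""
    return scale * np.sum(task_vectors, axis=0)


if __name__ == "__main__":
    for num_experts in (2, 5, 10, 20):
        merged = task_arithmetic_merge(toy_task_vectors(num_experts))
        print(num_experts, round(stable_rank(merged), 2))
```

In this toy setting the shared component grows linearly with the number of experts while the independent parts grow only with its square root, so the stable rank of the merged task vector shrinks as experts are added.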

Abstract

Model merging combines multiple specialized expert models into a single model capable of performing multiple tasks. However, merging an increasing number of specialized experts generally yields diminishing returns and reduced overall performance gains. In this work, we empirically and theoretically analyze this limitation, proving that for Task Arithmetic-based methods, as more experts are merged, the common information dominates the task-specific information, leading to inevitable rank collapse. To mitigate this issue, we introduce Subspace Boosting, which operates on the task vector space after singular value decomposition and maintains task vector ranks. Subspace Boosting raises merging efficacy for up to 20 experts by large margins of more than 10% when evaluated on both vision and language benchmarks. Moreover, we propose employing Higher-Order Generalized Singular Value Decomposition to quantify task similarity, offering a new interpretable perspective on model merging. Code and models are available at https://github.com/ronskoro/Subspace-Boosting.
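A minimal sketch of what boosting singular values in the decomposed task vector space could look like is given below. The abstract only states that Subspace Boosting works on the singular value decomposition and maintains task vector ranks; the specific rule used here (pick a cutoff where the cumulative singular-value mass reaches a threshold, then raise every smaller singular value to the cutoff value) is an assumption for illustration, as are the function name `subspace_boost` and the parameter `beta`. The authors' actual implementation is in the linked repository.

```python
import numpy as np


def subspace_boost(merged_task_vector, beta=0.5):
    """Raise small singular values of a 2-D merged task vector.

    Assumed rule (hypothetical): find the first index where the normalized
    cumulative sum of singular values reaches `beta`, then clamp all smaller
    singular values up to the value at that index.
    """
    u, s, vt = np.linalg.svd(merged_task_vector, full_matrices=False)
    total = np.sum(s)
    if total == 0.0:
        # Nothing to boost in an all-zero matrix.
        return merged_task_vector.copy()

    cumulative = np.cumsum(s) / total
    # First index with cumulative mass >= beta, guarded against round-off.
    cutoff = min(int(np.searchsorted(cumulative, beta)), len(s) - 1)

    # Singular values are sorted in descending order, so this only lifts the tail.
    boosted = np.maximum(s, s[cutoff])
    return (u * boosted) @ vt


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dominant = 10.0 * np.outer(rng.standard_normal(16), rng.standard_normal(16))
    merged = dominant + 0.1 * rng.standard_normal((16, 16))
    boosted = subspace_boost(merged, beta=0.9)
    print(np.linalg.svd(merged, compute_uv=False)[[0, -1]])
    print(np.linalg.svd(boosted, compute_uv=False)[[0, -1]])
```

In practice such a function would be applied per weight matrix of the merged task vector before adding it back to the pretrained weights; the leading singular directions are left untouched, and only the suppressed tail is lifted.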
