An Empirical Study and Theoretical Explanation on Task-Level Model-Merging Collapse

Yuan Cao; Dezhi Ran; Yuzhe Guo; Mengzhou Wu; Simin Chen; Linyi Li; Wei Yang; Tao Xie

TL;DR

This paper identifies and characterizes task-level merging collapse, in which certain task combinations consistently trigger severe performance degradation across all merging methods. It explains the phenomenon through rate-distortion theory with a dimension-dependent bound, establishing fundamental limits on task mergeability regardless of methodology.

Abstract

Model merging unifies independently fine-tuned LLMs from the same base, enabling reuse and integration of parallel development efforts without retraining. In practice, however, merging does not always succeed: certain combinations of task-specialist models suffer catastrophic performance degradation after merging. We refer to this failure mode as merging collapse. Intuitively, collapse arises when the learned representations or parameter adjustments for different tasks are fundamentally incompatible, so that merging forces destructive interference rather than synergy. In this paper, we identify and characterize task-level merging collapse, in which certain task combinations consistently trigger severe performance degradation across all merging methods. Through extensive experiments and statistical analysis, we demonstrate that representational incompatibility between tasks is strongly correlated with merging collapse, while parameter-space conflict metrics show minimal correlation, challenging conventional wisdom in the model-merging literature. We provide a theoretical explanation of this phenomenon through rate-distortion theory with a dimension-dependent bound, establishing fundamental limits on task mergeability regardless of methodology.

Paper Structure (20 sections, 3 theorems, 7 equations, 1 figure, 10 tables)

Introduction
PRELIMINARY
Empirical Investigation of Task-Level Model-Merging Collapse
Study Setup
RQ1: Model Merging Collapse
RQ2: Method vs. Task Dependence in Merging Collapse
RQ3: Correlative Factors for Merging Collapse
Theoretical Explanation
Hidden-State Distance Similarity
Empirical results on correlation when merging pairs of models
Empirical results on hidden-state similarity for scaled merging
Related Work
Development of LLMs
Model Merging
Conclusion
...and 5 more sections

Key Result

Theorem 1

Let $\{\theta_i\}_{i=1}^N\subset\mathbb R^{p}$ be $N$ fine-tuned minima of the same base network $F(\cdot;\theta)$, and let $h(x;\theta)\in\mathbb R^{d}$ be a fixed hidden layer. Assume linear mode connectivity (LMC): every convex combination $\sum_i\alpha_i\theta_i$ ($\alpha_i\ge 0$, $\sum_i\alpha_i = 1$) … and for any candidate merged model $\hat{\theta}$ define the worst-case hidden-state distortion …

Figures (1)

Figure 1: Heatmap of similarity score of hidden states in GLUE tasks.
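
A pairwise similarity matrix like the one Figure 1 visualizes could be built roughly as follows. This is only a minimal sketch: the paper's exact similarity score is not given in this excerpt, so the mean L2 distance between hidden states on shared inputs, the `1 / (1 + distance)` mapping, and the task names are all assumptions made for illustration.

```python
import numpy as np

def pairwise_hidden_distance(hidden):
    """Mean per-input L2 distance between the hidden states of each task pair.

    `hidden` maps a task name to an array of shape (n_inputs, d) holding the
    hidden states of that task's model on the SAME probe inputs (assumption).
    """
    names = sorted(hidden)
    dist = np.zeros((len(names), len(names)))
    for i, a in enumerate(names):
        for j, b in enumerate(names):
            dist[i, j] = np.linalg.norm(hidden[a] - hidden[b], axis=1).mean()
    return names, dist

def distance_to_similarity(dist):
    """Hypothetical mapping from distance to a similarity score in (0, 1]."""
    return 1.0 / (1.0 + dist)

# Toy data: three hypothetical task models probed on 8 inputs, hidden size 4.
rng = np.random.default_rng(0)
base = rng.normal(size=(8, 4))
hidden = {"task_a": base, "task_b": base + 0.1, "task_c": base + 2.0}

names, dist = pairwise_hidden_distance(hidden)
sim = distance_to_similarity(dist)  # rows/columns follow the order in `names`
```

In this toy setup `task_a` and `task_b` receive a higher score than `task_a` and `task_c`, which is the kind of contrast a heatmap would make visible.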

Theorems & Definitions (6)

Theorem 1: Hidden–State Diameter Controls Mergeability
Proof (sketch)
Lemma 1: Jung's Theorem (Jung, 1901)
Proof
Proof of Theorem 1
Corollary 1: Practical mergeability test
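
The geometric ingredient named in Lemma 1 can be illustrated numerically. Jung's theorem states that any set of diameter $D$ in $\mathbb R^{d}$ fits inside a ball of radius at most $D\sqrt{d/(2(d+1))}$. The sketch below computes the diameter of a few hypothetical hidden-state vectors and the corresponding radius bound. How the paper combines this bound with the distortion in Theorem 1 is not visible in this excerpt, so the example covers only the lemma itself; the function names are illustrative.

```python
import math
import numpy as np

def diameter(points):
    """Largest pairwise Euclidean distance among the rows of `points`."""
    diffs = points[:, None, :] - points[None, :, :]
    return float(np.sqrt((diffs ** 2).sum(axis=-1)).max())

def jung_radius_bound(diam, dim):
    """Jung's theorem: a set of diameter `diam` in R^dim lies in a ball
    whose radius is at most diam * sqrt(dim / (2 * (dim + 1)))."""
    return diam * math.sqrt(dim / (2.0 * (dim + 1)))

# Hypothetical hidden states: an equilateral triangle with side 1 in R^2,
# the configuration for which Jung's bound is attained in the plane.
states = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, math.sqrt(3) / 2]])

D = diameter(states)
bound = jung_radius_bound(D, dim=states.shape[1])

# For this triangle the centroid is the center of the enclosing circle.
center = states.mean(axis=0)
radius = float(np.linalg.norm(states - center, axis=1).max())
```

Here `radius` matches `bound` up to floating-point error, showing that the dimension-dependent constant cannot be improved in general.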
