Activation-Guided Consensus Merging for Large Language Models
Yuxuan Yao, Shuqi Liu, Zehua Liu, Qintong Li, Mingyang Liu, Xiongwei Han, Zhijiang Guo, Han Wu, Linqi Song
TL;DR
ACM tackles efficient, accurate merging of diverse LLMs by leveraging activation mutual information to compute layer-wise coefficients. It reveals that layer heterogeneity is crucial for preserving task-specific capabilities and can be exploited without gradient-based training. Across L2S and general merging tasks, ACM consistently improves reasoning accuracy while drastically reducing output length, as shown on Qwen-7B with TIES and other model families. The approach offers a practical, training-free path to robust, compact merged models with broad applicability.
Abstract
Recent research has increasingly focused on reconciling the reasoning capabilities of System 2 with the efficiency of System 1. While existing training-based and prompt-based approaches face significant challenges in terms of efficiency and stability, model merging emerges as a promising strategy to integrate the diverse capabilities of different Large Language Models (LLMs) into a unified model. However, conventional model merging methods often assume uniform importance across layers, overlooking the functional heterogeneity inherent in neural components. To address this limitation, we propose \textbf{A}ctivation-Guided \textbf{C}onsensus \textbf{M}erging (\textbf{ACM}), a plug-and-play merging framework that determines layer-specific merging coefficients based on mutual information between activations of pre-trained and fine-tuned models. ACM effectively preserves task-specific capabilities without requiring gradient computations or additional training. Extensive experiments on Long-to-Short (L2S) and general merging tasks demonstrate that ACM consistently outperforms all baseline methods. For instance, in the case of Qwen-7B models, TIES-Merging equipped with ACM achieves a \textbf{55.3\%} reduction in response length while simultaneously improving reasoning accuracy by \textbf{1.3} points.
