Reinforced Model Merging
Jiaqi Han, Jingwen Ye, Shunyu Liu, Haofei Zhang, Jie Song, Zunlei Feng, Mingli Song
TL;DR
This work tackles the challenge of efficiently merging multiple pre-trained models without gradient access by formulating the task as a reinforcement learning problem. It introduces Reinforced Model Merging (RMM), in which a merging agent interacts with a merging environment through layer-wise actions and is optimized with PPO, and a Dynamic Average Reward (DAR) mechanism that reduces evaluation cost. DAR speeds up the search by up to 100× while RMM achieves state-of-the-art performance on both vision and NLP benchmarks, including ViT and T5-based setups. The approach is practical for edge devices and flexible across merging algorithms and base models, with potential extensions to multi-modal and heterogeneous settings.
Abstract
The success of large language models has drawn widespread attention to model merging techniques, especially training-free methods that combine model capabilities within the parameter space. However, two challenges remain: (1) uniform treatment of all parameters leads to performance degradation; (2) search-based algorithms are often inefficient. In this paper, we present an innovative framework termed Reinforced Model Merging (RMM), which encompasses an environment and an agent tailored for merging tasks. These components interact to execute layer-wise merging actions, aiming to search for the optimal merging architecture. Notably, RMM operates without any gradient computations on the original models, rendering it feasible for edge devices. Furthermore, by utilizing data subsets during the evaluation process, we address the bottleneck in the reward feedback phase, thereby accelerating RMM by up to 100 times. Extensive experiments demonstrate that RMM achieves state-of-the-art performance across various vision and NLP datasets and effectively overcomes the limitations of existing baseline methods. Our code is available at https://github.com/WuDiHJQ/Reinforced-Model-Merging.
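
To make the idea of layer-wise merging with subset-based reward feedback concrete, the following is a minimal sketch and not the authors' implementation. It assumes two models with identical layer names, a hypothetical three-way action set (`keep_a`, `keep_b`, `average`), and a user-supplied `score_fn`; plain random search stands in for the PPO-trained agent, and the subset reward is a simple mean rather than the paper's reward mechanism. All function and parameter names here are illustrative.

```python
import random

# Hypothetical action set for illustration; the paper's action space may differ.
ACTIONS = ("keep_a", "keep_b", "average")


def merge_layerwise(model_a, model_b, actions):
    """Build a merged model by applying one action per layer (no gradients)."""
    if len(actions) != len(model_a):
        raise ValueError("need exactly one action per layer")
    merged = {}
    for name, action in zip(model_a, actions):
        wa, wb = model_a[name], model_b[name]
        if action == "keep_a":
            merged[name] = list(wa)
        elif action == "keep_b":
            merged[name] = list(wb)
        elif action == "average":
            merged[name] = [(x + y) / 2 for x, y in zip(wa, wb)]
        else:
            raise ValueError(f"unknown action: {action}")
    return merged


def subset_reward(merged, dataset, score_fn, subset_size, rng):
    """Estimate the reward on a random subset of a non-empty dataset."""
    subset = rng.sample(dataset, min(subset_size, len(dataset)))
    return sum(score_fn(merged, example) for example in subset) / len(subset)


def random_search(model_a, model_b, dataset, score_fn,
                  episodes=50, subset_size=8, seed=0):
    """Random search over per-layer actions (a stand-in for the PPO agent)."""
    rng = random.Random(seed)
    best_actions, best_reward = None, float("-inf")
    for _ in range(episodes):
        # Sample one action per layer, merge, and score on a data subset.
        actions = tuple(rng.choice(ACTIONS) for _ in model_a)
        merged = merge_layerwise(model_a, model_b, actions)
        reward = subset_reward(merged, dataset, score_fn, subset_size, rng)
        if reward > best_reward:
            best_actions, best_reward = actions, reward
    return best_actions, best_reward
```

Here each model is a dictionary mapping layer names to flat lists of weights, and `score_fn(merged, example)` returns a per-example score such as 1.0 for a correct prediction. Scoring only `subset_size` examples per episode is what keeps each reward evaluation cheap, at the price of a noisier estimate.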
