Merge and Guide: Unifying Model Merging and Guided Decoding for Controllable Multi-Objective Generation

Guofu Xie, Chen Zhang, Xiao Zhang, Yunsheng Shi, Ting Yao, Jun Xu

TL;DR

This work introduces Merge-And-GuidE (MAGE), a two-stage framework that leverages model merging for guided decoding and outperforms existing approaches, achieving superior controllability, Pareto-optimal performance, and enhanced adaptability.

Abstract

Adapting to diverse user needs at test time is a key challenge in controllable multi-objective generation. Existing methods are insufficient: merging-based approaches provide indirect, suboptimal control at the parameter level, often disregarding the impacts of multiple objectives. While decoding-based guidance is more direct, it typically requires aggregating logits from multiple expert models, incurring significant space overhead and relying heavily on individual model capacity. To address these issues, we introduce Merge-And-GuidE (MAGE), a two-stage framework that leverages model merging for guided decoding. We first identify a critical compatibility problem between the guidance and base models. In Stage 1, MAGE resolves this by dynamically constructing a more robust base model, merging a series of backbone models that account for multiple objectives. In Stage 2, we merge explicit and implicit value models into a unified guidance proxy, which then steers the decoding of the base model from Stage 1. Our analysis empirically validates Linear Mode Connectivity (LMC) in value models, explores the relationship between model merging and prediction ensembling, and demonstrates the enhanced controllability afforded by our approach. Extensive experiments show that our method outperforms existing approaches, achieving superior controllability, Pareto-optimal performance, and enhanced adaptability.
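The two operations the abstract combines can be illustrated with a minimal sketch. It assumes that merging is a weighted average of parameters and that guidance adds a scaled value estimate to the base model's next-token logits. Both are common formulations, but this summary does not state the paper's exact rules. The function names `merge_params` and `guided_next_token_probs` and the `strength` parameter are hypothetical and chosen for illustration only.

```python
import numpy as np


def merge_params(param_sets, weights):
    """Weighted average of parameter dicts (assumed form of model merging)."""
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so coefficients sum to 1
    keys = param_sets[0].keys()
    return {k: sum(w * p[k] for w, p in zip(weights, param_sets)) for k in keys}


def guided_next_token_probs(base_logits, token_values, strength=1.0):
    """Re-weight the base distribution: p(x) proportional to p_base(x) * exp(strength * V(x)).

    `token_values` holds one value estimate per candidate token; this
    re-weighting rule is an assumption, not the paper's stated formula.
    """
    scores = np.asarray(base_logits, dtype=float) + strength * np.asarray(token_values, dtype=float)
    scores = scores - scores.max()  # subtract the max for numerical stability
    probs = np.exp(scores)
    return probs / probs.sum()


# Stage 2 style usage: merge two single-objective value heads by preference mu,
# then let the merged head steer the base model's next-token distribution.
mu = [0.7, 0.3]
value_heads = [{"w": np.array([1.0, 0.0, -1.0])}, {"w": np.array([-1.0, 0.0, 1.0])}]
merged_head = merge_params(value_heads, mu)
probs = guided_next_token_probs(np.zeros(3), merged_head["w"], strength=2.0)
```

In this toy setup the value head directly stores one score per token. A real value model would compute that score from the prefix and the candidate token.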

Paper Structure

This paper contains 78 sections, 1 theorem, 45 equations, 18 figures, 5 tables.

Key Result

theorem 1

Given quadratic reward functions with Hessians proportional to identity matrices: where $k_i \in \mathbb{R}_+$ are distinct, and $\theta_i$ is the global maximum for reward $r_i$. Let the reward combination weight matrix be $\bm B = ,\beta\in(1/2, 1)$, then the backbone rewards of the Bone Soup approach can be denoted as $(h_1,h_2)^\top = \bm B (r_1,r_2)^\top$. Let $\bm{\mu} = (\m with interval le

Figures (18)

  • Figure 1: An overview of our two-stage MAGE framework. Stage 1 (left): Dynamic Policy Model Construction. We call our Stage 1 method Bone Soup. We first construct a set of combined "Backbone Rewards" from single-objective reward models (RMs). These new rewards are used to train a diverse set of backbone models. Based on the user's preference $\mu$, we then determine the merging coefficients $\lambda$ to "soup" these backbones into a single, tailored base model. Stage 2 (right): Unified Guidance via Value Model Merging. We merge multiple single-objective value models into a unified guidance model based on $\mu$. This merged model then steers the base model's decoding process by re-weighting its output distribution, leading to a generation that better aligns with the user's preferences and achieves a superior Pareto front.
  • Figure 2: Visualizing the Compatibility Challenge: Static vs. Dynamic Base Models. The plot compares the Pareto fronts on the Faithful vs. Summary trade-off. Methods employing a static base model (e.g., 'SFT+(E)', where a single SFT model is guided across various strengths) are confined to small, suboptimal clusters. This demonstrates their inability to adapt and explore the solution space effectively, even with guidance. In contrast, methods with a dynamic base model ("Bone Soup", "MAGE" variants) trace a dominant and expansive Pareto front, showing their superior adaptability to diverse user preferences.
  • Figure 3: Motivation for Bone Soup. (a) A comparison of the soup-like (RS) yang2024rewards and our Bone Soup (BS) fronts for Example 2.1. Our BS approach first seeks the backbone models. The heatmap indicates the magnitude of the testing reward as a function of two inputs $x$ and $y$. As shown in the figure, the points on the BS front are closer to the exact solution, highlighting the importance of constructing backbone models. (b) The solutions for the same preference across different methods are connected by blue lines. For each line, the closer the solution is to the red point (oracle), the better the result. Many of the yellow points in the middle nearly overlap with the red point, indicating better solutions than the blue points further away. This highlights the importance of using backbone rewards to construct the backbone model.
  • Figure 4: Empirical validation of the Linear Mode Connectivity (LMC) property in value models. Across three trade-off settings, namely (a) "helpful vs. humor", (b) "helpful vs. harmless", and (c) "faithful vs. summary", the reward curves generated by merged value models (solid lines with markers) are consistently concave. They lie above the direct linear interpolation (dashed lines) between the two endpoint models, confirming the LMC property described in Observation \ref{obs:lmc}.
  • Figure 5: Model merging as an efficient proxy for prediction ensembling. The figure compares the Pareto frontiers generated by a single merged value model ("Merge") against a computationally expensive ensemble of value models ("Prediction Ensemble"). Across the three trade-offs, the performance is highly comparable, with the merged model's frontier often closely matching or even Pareto-dominating the ensemble's. This supports Observation \ref{obs:merge_ens}, highlighting merging as a high-fidelity and efficient alternative.
  • ...and 13 more figures
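
The relationship between merging and prediction ensembling described in the Figure 5 caption can be illustrated with a toy check. The sketch below is not the paper's experiment. It assumes two hypothetical value heads and compares (i) one forward pass through the weight-averaged head with (ii) the weighted average of the two heads' predictions. For a purely linear head the two coincide exactly. For a small nonlinear head they generally differ, which is why the paper's comparison is an empirical one. All names (`linear_value`, `mlp_value`, `merge_vs_ensemble_gaps`) are illustrative.

```python
import numpy as np


def linear_value(w, h):
    """Linear value head: one scalar per hidden state."""
    return h @ w


def mlp_value(w1, w2, h):
    """One-hidden-layer value head with a tanh nonlinearity."""
    return np.tanh(h @ w1) @ w2


def merge_vs_ensemble_gaps(mu=0.3, seed=0):
    """Return the max |merged - ensembled| prediction gap for both head types."""
    rng = np.random.default_rng(seed)
    h = rng.normal(size=(4, 8))  # toy hidden states

    # Linear heads: averaging the weights equals averaging the predictions.
    wa, wb = rng.normal(size=8), rng.normal(size=8)
    merged = linear_value(mu * wa + (1 - mu) * wb, h)
    ensembled = mu * linear_value(wa, h) + (1 - mu) * linear_value(wb, h)
    linear_gap = float(np.max(np.abs(merged - ensembled)))

    # Nonlinear heads: the identity no longer holds exactly.
    w1a, w1b = rng.normal(size=(8, 5)), rng.normal(size=(8, 5))
    w2a, w2b = rng.normal(size=5), rng.normal(size=5)
    merged = mlp_value(mu * w1a + (1 - mu) * w1b, mu * w2a + (1 - mu) * w2b, h)
    ensembled = mu * mlp_value(w1a, w2a, h) + (1 - mu) * mlp_value(w1b, w2b, h)
    nonlinear_gap = float(np.max(np.abs(merged - ensembled)))

    return linear_gap, nonlinear_gap


linear_gap, nonlinear_gap = merge_vs_ensemble_gaps()
```

The merged head needs a single set of weights and a single forward pass, whereas the ensemble stores and evaluates every head. This is the efficiency argument the caption makes.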

Theorems & Definitions (3)

  • theorem 1
  • proof: Proof of Theorem \ref{thm:bonesoup:journal:reward}
  • definition 1: Controllability Score