Synergistic Weak-Strong Collaboration by Aligning Preferences
Yizhu Jiao, Xuchao Zhang, Zhaoyang Wang, Yubo Ma, Zhun Deng, Rujia Wang, Chetan Bansal, Saravan Rajmohan, Jiawei Han, Huaxiu Yao
TL;DR
CoWest extends LLM problem-solving to domain-specific knowledge by pairing a domain-specialized weak model with a strong general model. At inference, the weak model drafts a response and the strong model refines it. During training, a feedback loop aligns the weak model to the strong model's preferences through Direct Preference Optimization, so that the final output is $y^* = \pi_s \circ ( x, \pi_w^* \circ x )$, where $\pi_s$ is the strong model and $\pi_w^*$ is the aligned weak model. Experiments on three domains show substantial gains over single-model baselines, with additional improvements when the weak model is aligned to collaborative preferences. Theoretical analysis shows that preference alignment biases the weak model away from unhelpful outputs, offering a scalable path to deploying domain-specific reasoning without full fine-tuning of massive models. Overall, CoWest provides a practical framework for extending LLMs to niche tasks while mitigating privacy and scalability concerns.
Abstract
Current Large Language Models (LLMs) excel in general reasoning yet struggle with specialized tasks requiring proprietary or domain-specific knowledge. Fine-tuning large models for every niche application is often infeasible due to black-box constraints and high computational overhead. To address this, we propose a collaborative framework that pairs a specialized weak model with a general strong model. The weak model, tailored to specific domains, produces initial drafts and background information, while the strong model leverages its advanced reasoning to refine these drafts, extending LLMs' capabilities to critical yet specialized tasks. To optimize this collaboration, we introduce collaborative feedback to fine-tune the weak model: it quantifies the influence of the weak model's contributions in the collaboration procedure and establishes preference pairs that guide preference tuning of the weak model. We validate our framework through experiments on three domains. We find that the collaboration significantly outperforms each model alone by leveraging complementary strengths. Moreover, aligning the weak model with the collaborative preference further enhances overall performance.
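The draft-then-refine pipeline and the construction of preference pairs described above can be illustrated with a minimal sketch. It is not the authors' implementation. The names `collaborate`, `build_preference_pair`, and the `score` callable are hypothetical, and ranking drafts by the score of the resulting collaborative output is an assumed stand-in for the paper's influence measure.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical interfaces (assumptions, not from the paper):
WeakModel = Callable[[str], str]          # query -> draft
StrongModel = Callable[[str, str], str]   # (query, draft) -> refined answer
Scorer = Callable[[str, str], float]      # (query, final answer) -> quality score


def collaborate(x: str, weak: WeakModel, strong: StrongModel) -> str:
    """Compute y = strong(x, weak(x)): the weak model drafts, the strong model refines."""
    draft = weak(x)
    return strong(x, draft)


def build_preference_pair(
    x: str,
    drafts: List[str],
    strong: StrongModel,
    score: Scorer,
) -> Optional[Dict[str, str]]:
    """Rank candidate weak-model drafts by the quality of the final output they lead to.

    Assumption: a draft is 'preferred' when refining it yields a higher-scoring
    collaborative answer. Returns None when the drafts cannot be separated.
    """
    scored = [(score(x, strong(x, d)), d) for d in drafts]
    best = max(scored, key=lambda t: t[0])
    worst = min(scored, key=lambda t: t[0])
    if best[0] == worst[0]:
        return None  # no preference signal in this batch of drafts
    # Record layout commonly used for DPO-style preference tuning.
    return {"prompt": x, "chosen": best[1], "rejected": worst[1]}


# Toy stand-ins so the sketch runs without any real model.
toy_weak: WeakModel = lambda x: "draft"
toy_strong: StrongModel = lambda x, d: d.upper()
toy_score: Scorer = lambda x, y: float(len(y))

final_answer = collaborate("q", toy_weak, toy_strong)
pair = build_preference_pair("q", ["a", "abc", "ab"], toy_strong, toy_score)
```

Pairs in this form could then be passed to a preference-optimization trainer to update only the weak model, leaving the strong model as a black box.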
