G-Boost: Boosting Private SLMs with General LLMs
Yijiang Fan, Yuren Mao, Longbin Lai, Ying Zhang, Zhengping Qian, Yunjun Gao
TL;DR
The paper tackles the challenge of improving domain-specific performance for resource-constrained private SLMs by enabling adaptive collaboration with general LLMs. It introduces G-Boost, a framework that uses a Process Reward Model (PRM) to guide Monte Carlo Tree Search (MCTS) over a tree of reasoning steps, balancing two inference modes: SLM-LLM collaborative inference with logit fusion and private SLM inference. The approach is strengthened by a PRM-based evaluation that replaces rollouts, and by backpropagation to refine the search tree, enabling dynamic exploration of reasoning paths. Experiments on GSM8K and MATH-500, with MetaMathQA as private data, show that G-Boost consistently outperforms tuned private SLMs, general LLMs, and static collaborative methods, highlighting its practical potential for edge-cloud reasoning in domain-specific tasks.
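The search loop described above can be illustrated with a minimal sketch. It is not the authors' implementation: the convex-combination fusion rule, the UCT selection formula, the constants, and every name here (`fuse_logits`, `toy_generate_step`, `toy_prm_score`, the two-token vocabulary) are assumptions made for illustration, and the models are replaced by hard-coded toy logits.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional

MODES = ("private", "collab")  # the two inference modes named in the summary


def fuse_logits(slm_logits, llm_logits, alpha=0.5):
    """Assumed fusion rule: convex combination of per-token logits."""
    return [alpha * s + (1.0 - alpha) * l for s, l in zip(slm_logits, llm_logits)]


@dataclass
class Node:
    steps: List[str]                      # reasoning steps chosen so far
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)
    visits: int = 0
    value_sum: float = 0.0


def uct(child: Node, parent_visits: int, c: float = 1.4) -> float:
    """Standard UCT score (an assumption; the summary does not give the formula)."""
    if child.visits == 0:
        return math.inf
    mean = child.value_sum / child.visits
    return mean + c * math.sqrt(math.log(parent_visits) / child.visits)


def search(root: Node,
           generate_step: Callable[[List[str], str], str],
           prm_score: Callable[[List[str]], float],
           iterations: int = 50,
           max_depth: int = 2) -> List[str]:
    for _ in range(iterations):
        node = root
        # Selection: descend by UCT until reaching a node without children.
        while node.children:
            parent = node
            node = max(parent.children, key=lambda ch: uct(ch, parent.visits))
        # Expansion: one child per inference mode, then evaluate the first.
        if len(node.steps) < max_depth:
            for mode in MODES:
                step = generate_step(node.steps, mode)
                node.children.append(Node(node.steps + [step], parent=node))
            node = node.children[0]
        # Evaluation: a PRM score stands in for a rollout.
        reward = prm_score(node.steps)
        # Backpropagation: update statistics along the path to the root.
        while node is not None:
            node.visits += 1
            node.value_sum += reward
            node = node.parent
    # Read out the most-visited path.
    node = root
    while node.children:
        node = max(node.children, key=lambda ch: ch.visits)
    return node.steps


# Toy stand-ins for the models (hypothetical values, for demonstration only).
VOCAB = ["guess", "derive"]
SLM_LOGITS = [1.0, 0.5]   # the toy SLM prefers "guess"
LLM_LOGITS = [0.0, 2.0]   # the toy LLM prefers "derive"


def toy_generate_step(steps: List[str], mode: str) -> str:
    logits = SLM_LOGITS if mode == "private" else fuse_logits(SLM_LOGITS, LLM_LOGITS)
    return VOCAB[max(range(len(VOCAB)), key=lambda i: logits[i])]


def toy_prm_score(steps: List[str]) -> float:
    # Toy reward: fraction of steps that are "derive".
    return sum(s == "derive" for s in steps) / len(steps) if steps else 0.0


root = Node(steps=[])
path = search(root, toy_generate_step, toy_prm_score)
```

In this toy setup the search concentrates its visits on whichever mode the reward function favors at each step; a real system would replace the hard-coded logits with model calls and the toy reward with a trained PRM.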
Abstract
Due to limited computational resources, most Large Language Model (LLM) developers can only fine-tune Small Language Models (SLMs) on their own data. These private SLMs typically have limited effectiveness. To boost their performance, this paper proposes asking general LLMs for help. The general LLMs can be APIs or larger LLMs whose inference cost the developers can afford. Specifically, we propose the G-Boost framework, in which a private SLM adaptively performs collaborative inference with a general LLM under the guidance of a process reward. Experiments demonstrate that our framework can significantly boost the performance of private SLMs.
