An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing
Ziwei Chai, Guoyin Wang, Jing Su, Tianjie Zhang, Xuanwen Huang, Xuwu Wang, Jingjing Xu, Jianbo Yuan, Hongxia Yang, Fei Wu, Yang Yang
TL;DR
This work introduces Expert-Token-Routing (ETR), a unified generalist framework that incorporates multiple expert LLMs by encoding each expert as a special token in the vocabulary of a frozen meta LLM. Routing to an expert is performed as part of standard next-token prediction, with expert token embeddings learned from an expert query set so that the meta LLM can delegate to the right specialist. Training updates only the expert token embeddings, so new experts can be plugged in without retraining the backbone model. Across six expert domains on the MMLU-Expert benchmark, ETR achieves higher overall accuracy and expert routing accuracy than prompting-based and router-based baselines, while presenting a single-model interface to the user and remaining robust to dynamic extension with minimal performance loss. The results demonstrate the practicality of building scalable, real-time multi-expert systems by unifying expert knowledge under a single, generalist interface.
Abstract
We present Expert-Token-Routing, a unified generalist framework that facilitates seamless integration of multiple expert LLMs. Our framework represents expert LLMs as special expert tokens within the vocabulary of a meta LLM. The meta LLM can route to an expert LLM in the same way it generates a new token. Expert-Token-Routing not only supports learning the implicit expertise of expert LLMs from existing instruction datasets but also allows for dynamic extension with new expert LLMs in a plug-and-play manner. It also conceals the detailed collaboration process from the user, who interacts with the system as though it were a single LLM. Our framework outperforms various existing multi-LLM collaboration paradigms across benchmarks that incorporate six diverse expert domains, demonstrating effectiveness and robustness in building a generalist LLM system by synergizing multiple expert LLMs.
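The routing idea described above can be illustrated with a minimal toy sketch in pure Python. It is not the authors' implementation: the hidden states are hand-written stand-ins for the meta LLM's final hidden state, the names `route`, `train_expert_tokens`, `expert_a`, and `expert_b` are hypothetical, and taking the softmax over the full extended vocabulary with a plain cross-entropy update is an assumption made for illustration. The sketch only shows the two properties stated in the abstract: expert tokens are scored like ordinary vocabulary tokens, and only their embeddings are trained while the original vocabulary stays frozen.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def next_token_probs(hidden, vocab_emb, expert_emb):
    """Score frozen vocabulary tokens and expert tokens in one softmax."""
    logits = [dot(hidden, e) for e in vocab_emb + expert_emb]
    return softmax(logits)

def route(hidden, vocab_emb, expert_emb, expert_names):
    """Return an expert name if an expert token is the most likely next
    token, or None if an ordinary token wins (the meta LLM answers itself)."""
    probs = next_token_probs(hidden, vocab_emb, expert_emb)
    best = max(range(len(probs)), key=probs.__getitem__)
    if best >= len(vocab_emb):
        return expert_names[best - len(vocab_emb)]
    return None

def train_expert_tokens(queries, vocab_emb, expert_emb, lr=0.5, epochs=200):
    """Cross-entropy gradient steps on the expert embeddings only.
    queries: list of (hidden_state, target_expert_index) pairs.
    vocab_emb is read but never modified (the frozen backbone)."""
    n_vocab = len(vocab_emb)
    for _ in range(epochs):
        for hidden, target in queries:
            probs = next_token_probs(hidden, vocab_emb, expert_emb)
            for k, emb in enumerate(expert_emb):
                # d(loss)/d(logit_k) = p_k - y_k, and logit_k = hidden . emb_k
                scale = probs[n_vocab + k] - (1.0 if k == target else 0.0)
                for d in range(len(emb)):
                    emb[d] -= lr * scale * hidden[d]
    return expert_emb

# Toy setup (all values are illustrative assumptions).
vocab_emb = [[0.1, 0.1, 0.5, 0.0], [0.0, 0.2, 0.0, 0.5], [0.2, 0.0, 0.1, 0.1]]
vocab_before = [row[:] for row in vocab_emb]   # copy to check it stays frozen
expert_names = ["expert_a", "expert_b"]        # hypothetical expert LLMs
expert_emb = [[0.0] * 4, [0.0] * 4]            # one new embedding per expert

# Stand-in "expert query set": hidden states labelled with the right expert.
queries = [
    ([1.0, 0.0, 0.2, 0.0], 0),
    ([0.9, 0.1, 0.0, 0.1], 0),
    ([0.0, 1.0, 0.0, 0.2], 1),
    ([0.1, 0.9, 0.1, 0.0], 1),
]
train_expert_tokens(queries, vocab_emb, expert_emb)

print(route([1.0, 0.0, 0.1, 0.0], vocab_emb, expert_emb, expert_names))
print(route([0.0, 1.0, 0.0, 0.1], vocab_emb, expert_emb, expert_names))
```

Under these assumptions, adding a further expert amounts to appending one more row to `expert_emb` and training that row, which mirrors the plug-and-play extension the abstract describes.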
