DOS: Dual-Flow Orthogonal Semantic IDs for Recommendation in Meituan
Junwei Yin, Senjie Kou, Changhao Li, Shuli Wang, Xue Wei, Yinqiu Huang, Yinhua Zhu, Haitao Wang, Xingxing Wang
TL;DR
The paper tackles the scalability and quality gaps in generative recommendation by introducing DOS, a framework that learns context-aware Semantic IDs aligned with the downstream generation space. DOS combines Dual-Flow Integration, which fuses explicit user-item interactions with a shared codebook, and Orthogonal Residual Quantization, which rotates the semantic space to preserve task-relevant semantics across layers. The approach improves offline AUC and F1 as well as Hit@10 for next-token prediction, and it delivers a 1.15% online revenue uplift in Meituan’s A/B test. These results point to a scalable, deployment-ready pathway for SID-based generation in large-scale industrial settings.
Abstract
Semantic IDs serve as a key component in generative recommendation systems. They not only incorporate open-world knowledge from large language models (LLMs) but also compress the semantic space to reduce generation difficulty. However, existing methods suffer from two major limitations: (1) the lack of contextual awareness in generation tasks leads to a gap between the Semantic ID codebook space and the generation space, resulting in suboptimal recommendations; and (2) suboptimal quantization methods exacerbate semantic loss in LLMs. To address these issues, we propose the Dual-Flow Orthogonal Semantic IDs (DOS) method. Specifically, DOS employs a user-item dual-flow framework that leverages collaborative signals to align the Semantic ID codebook space with the generation space. Furthermore, we introduce an orthogonal residual quantization scheme that rotates the semantic space to an appropriate orientation, thereby maximizing semantic preservation. Extensive offline experiments and online A/B testing demonstrate the effectiveness of DOS. The proposed method has been successfully deployed in Meituan's mobile application, serving hundreds of millions of users.
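To make the idea of rotating the semantic space before residual quantization concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes 2-D embeddings, two small hand-written axis-aligned codebooks, and a rotation angle chosen by hand; the function names (`rotate`, `nearest`, `residual_quantize`) are hypothetical, and this section does not describe how DOS actually obtains its rotation or codebooks. The sketch only illustrates that an orthogonal transform preserves vector norms while it can reduce the error left after greedy residual quantization.

```python
import math

def rotate(v, theta):
    """Apply a 2-D rotation (an orthogonal transform) to vector v."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def norm(v):
    """Euclidean norm of a 2-D vector."""
    return math.hypot(v[0], v[1])

def nearest(v, codebook):
    """Index of the codeword closest to v in squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: (v[0] - codebook[i][0]) ** 2 + (v[1] - codebook[i][1]) ** 2,
    )

def residual_quantize(v, codebooks):
    """Greedy residual quantization: each layer encodes what the previous layers left over."""
    codes, residual = [], v
    for cb in codebooks:
        idx = nearest(residual, cb)
        codes.append(idx)
        residual = (residual[0] - cb[idx][0], residual[1] - cb[idx][1])
    return codes, residual

# Hypothetical axis-aligned codebooks: a coarse layer and a finer layer.
codebooks = [
    [(1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)],
    [(0.25, 0.0), (-0.25, 0.0), (0.0, 0.25), (0.0, -0.25)],
]

# A toy embedding of length 1.25 pointing 30 degrees off the x-axis.
angle = math.pi / 6
v = (1.25 * math.cos(angle), 1.25 * math.sin(angle))

# Quantize directly, then quantize after rotating v onto the codebook axis.
# The angle is picked by hand here purely for illustration.
codes_plain, res_plain = residual_quantize(v, codebooks)
v_rot = rotate(v, -angle)
codes_rot, res_rot = residual_quantize(v_rot, codebooks)

err_plain = norm(res_plain)
err_rot = norm(res_rot)
print("codes without rotation:", codes_plain, "residual norm:", err_plain)
print("codes with rotation:   ", codes_rot, "residual norm:", err_rot)
```

In this toy setting the rotated vector lines up with the codebook directions, so the two layers reconstruct it almost exactly, whereas the unrotated vector leaves a noticeably larger residual. Because the rotation is orthogonal, it changes only the orientation of the embedding, not its length.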
