Catalog-Native LLM: Speaking Item-ID Dialect with Less Entanglement for Recommendation

Reza Shirkavand; Xiaokai Wei; Chen Wang; Zheng Hui; Heng Huang; Michelle Gong

TL;DR

The paper addresses how to unify collaborative filtering signals with language-based reasoning in recommender systems without degrading language understanding. It introduces IDIOMoE, a dual-expert Mixture-of-Experts model that treats item IDs as a native dialect: item tokens are routed to an item expert and text tokens to a text expert, all within a shared Transformer backbone. Through extensive ablations and a novel FFN key-value memory analysis, the authors show that expert specialization and fixed token-type routing reduce semantic–collaborative interference, yielding superior recommendation performance on public Amazon catalogs and a large industry dataset while preserving the linguistic capabilities of the pretrained model. The work highlights the value of disentangled modalities for scalable, explainable recommendation and points toward modular LLM-based recommender systems.

Abstract

While collaborative filtering delivers predictive accuracy and efficiency, and Large Language Models (LLMs) enable expressive and generalizable reasoning, modern recommendation systems must bring these strengths together. Growing user expectations, such as natural-language queries and transparent explanations, further highlight the need for a unified approach. However, doing so is nontrivial. Collaborative signals are often token-efficient but semantically opaque, while LLMs are semantically rich but struggle to model implicit user preferences when trained only on textual inputs. This paper introduces Item-ID + Oral-language Mixture-of-Experts Language Model (IDIOMoE), which treats item interaction histories as a native dialect within the language space, enabling collaborative signals to be understood in the same way as natural language. By splitting the Feed Forward Network of each block of a pretrained LLM into a separate text expert and an item expert with token-type gating, our method avoids destructive interference between text and catalog modalities. IDIOMoE demonstrates strong recommendation performance across both public and proprietary datasets, while preserving the text understanding of the pretrained model.
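The token-type gating described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes, purely for illustration, that item-ID tokens are appended to the vocabulary after the text tokens, so a token counts as an item token when its ID is at or above a hypothetical `TEXT_VOCAB_SIZE`. The tiny ReLU feed-forward experts, the dimensions, and all function names are likewise assumptions.

```python
import random


def make_ffn(d_model, d_hidden, rng):
    """Build a tiny two-layer ReLU feed-forward network from nested lists."""
    w_in = [[rng.uniform(-0.1, 0.1) for _ in range(d_hidden)] for _ in range(d_model)]
    w_out = [[rng.uniform(-0.1, 0.1) for _ in range(d_model)] for _ in range(d_hidden)]

    def ffn(x):
        # Up-projection followed by ReLU.
        hidden = [
            max(0.0, sum(x[i] * w_in[i][j] for i in range(d_model)))
            for j in range(d_hidden)
        ]
        # Down-projection back to the model dimension.
        return [
            sum(hidden[j] * w_out[j][k] for j in range(d_hidden))
            for k in range(d_model)
        ]

    return ffn


def is_item_token(token_id, text_vocab_size):
    """Assumed convention: item-ID tokens occupy IDs >= text_vocab_size."""
    return token_id >= text_vocab_size


def dual_expert_ffn(hidden_states, token_ids, text_expert, item_expert, text_vocab_size):
    """Route each position to exactly one expert based on its token type.

    The routing is fixed (no learned router): item tokens use the item
    expert, and every other token uses the text expert.
    """
    outputs = []
    for x, tok in zip(hidden_states, token_ids):
        expert = item_expert if is_item_token(tok, text_vocab_size) else text_expert
        outputs.append(expert(x))
    return outputs


# Usage: two text tokens followed by two item-ID tokens.
TEXT_VOCAB_SIZE = 100  # hypothetical size of the text vocabulary
D_MODEL, D_HIDDEN = 4, 8

rng = random.Random(0)
text_expert = make_ffn(D_MODEL, D_HIDDEN, rng)
item_expert = make_ffn(D_MODEL, D_HIDDEN, rng)

token_ids = [5, 17, 100, 142]
hidden_states = [[rng.uniform(-1.0, 1.0) for _ in range(D_MODEL)] for _ in token_ids]

out = dual_expert_ffn(hidden_states, token_ids, text_expert, item_expert, TEXT_VOCAB_SIZE)
```

Because the gate depends only on token type, each expert's parameters are updated only by its own modality, which is the mechanism the abstract credits for avoiding destructive interference. Attention and the other shared layers are omitted from this sketch.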
