BeFA: A General Behavior-driven Feature Adapter for Multimedia Recommendation
Qile Fan, Penghang Yu, Zhiyi Tan, Bing-Kun Bao, Guanming Lu
TL;DR
The paper identifies that pre-trained content encoders can produce item features with information drift and omission, weakening user preference modeling in multimodal recommendations. It introduces a similarity-based attribution analysis to diagnose content-feature quality and BeFA, a plug-in, behavior-guided feature adapter that decouples, filters, and reconstructs content features using behavioral signals. BeFA is shown to consistently improve performance across multiple datasets, encoders, and baseline models with modest parameter overhead, indicating strong generalizability. Visualizations corroborate that BeFA focuses content features on relevant item details and reduces noise, suggesting practical impact for real-world multimodal recommendation systems.
Abstract
Multimedia recommender systems focus on utilizing behavioral information and content information to model user preferences. Typically, they employ pre-trained feature encoders to extract content features and then fuse them with behavioral features. However, pre-trained feature encoders often extract features from the entire content simultaneously, including excessive preference-irrelevant details. We speculate that this may leave the extracted features without sufficient information to accurately reflect user preferences. To verify our hypothesis, we introduce an attribution analysis method for visually and intuitively analyzing the content features. The results indicate that certain products' content features exhibit the issues of information drift and information omission, reducing the expressive ability of the features. Building upon this finding, we propose an effective and efficient general Behavior-driven Feature Adapter (BeFA) to tackle these issues. This adapter reconstructs the content features with the guidance of behavioral information, enabling them to accurately reflect user preferences. Extensive experiments demonstrate the effectiveness of the adapter across the multimedia recommendation methods evaluated. Our code is publicly available at https://github.com/fqldom/BeFA.
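
To make the idea of behavior-guided reconstruction concrete, the following is a minimal illustrative sketch, not the BeFA architecture itself. It assumes a single sigmoid gate computed from a behavioral embedding that rescales each dimension of a frozen content feature. The function name `behavior_gated_adapter`, the weight matrix `W_gate`, and all toy values are hypothetical and do not come from the paper or its repository.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def behavior_gated_adapter(content, behavior, W_gate):
    """Hypothetical gate: the behavioral embedding decides how much of
    each content dimension is kept (assumption, not the paper's design)."""
    gate = [sigmoid(z) for z in matvec(W_gate, behavior)]
    return [g * c for g, c in zip(gate, content)]


random.seed(0)
content = [0.8, -1.2, 0.3, 2.0]   # toy content feature from a frozen encoder
behavior = [0.5, -0.1, 0.9]       # toy behavioral embedding of the same item
# In practice W_gate would be learned with the recommender; random here.
W_gate = [[random.uniform(-1, 1) for _ in behavior] for _ in content]

adapted = behavior_gated_adapter(content, behavior, W_gate)
print(adapted)
```

Because every gate value lies strictly between 0 and 1, the adapted feature keeps the sign of each content dimension and can only shrink its magnitude, which is one simple way to suppress preference-irrelevant dimensions while leaving the pre-trained encoder untouched.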
