Cross-domain Recommender Systems via Multimodal Domain Adaptation
Adamya Shyam, Ramya Kamani, Venkateswara Rao Kagita, Vikas Kumar
TL;DR
The paper tackles data sparsity in collaborative filtering by proposing a cross-domain recommender system that uses multimodal information to align embeddings across domains. It introduces two domain adaptation variants, VDAR and FDAR, that fuse textual and visual features with latent interaction factors and train a domain classifier to align source and target embeddings in a semi-supervised setting. Extensive experiments on four Amazon domains show that FDAR consistently outperforms state-of-the-art baselines in both single-domain and cross-domain scenarios, with statistically significant improvements. This work demonstrates the value of combining multimodal representations and domain-adaptive alignment for robust recommendations in data-sparse environments and has the potential to mitigate cold-start issues.
Abstract
Collaborative Filtering (CF) has emerged as one of the most prominent implementation strategies for building recommender systems. The key idea is to exploit the usage patterns of individuals to generate personalized recommendations. CF techniques, especially for newly launched platforms, often face a critical issue known as the data sparsity problem, which greatly limits their performance. Cross-domain CF alleviates the problem of data sparsity by finding a common set of entities (users or items) across the domains, which then act as a conduit for knowledge transfer. Nevertheless, most real-world datasets are collected from different domains, so they often lack information about anchor points or reference information for entity alignment. This paper introduces a domain adaptation technique to align the embeddings of entities across domains. Our approach first exploits the available textual and visual information to independently learn a multi-view latent representation for each entity in the auxiliary and target domains. The different representations of the entity are then fused to generate the corresponding unified representation. A domain classifier is then trained to learn the embedding for the domain alignment by fixing the unified features as the anchor points. Experiments on four publicly available benchmark datasets indicate the effectiveness of our proposed approach.
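
The two building blocks named in the abstract, fusing the per-entity views into a unified representation and training a domain classifier on top of it, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the concatenation-based fusion, the logistic-regression classifier, and the names `fuse_views` and `train_domain_classifier` are all hypothetical choices made for the example. The alignment step that updates the embeddings against the classifier is omitted because the abstract does not specify it.

```python
import numpy as np


def fuse_views(text_feat, visual_feat, latent):
    """Fuse three per-entity views into one unified representation.

    Assumption: fusion is a plain concatenation of L2-normalized views.
    Each input has shape (n_entities, view_dim).
    """
    views = [text_feat, visual_feat, latent]
    # Normalize each view row-wise so that no single view dominates by scale.
    normed = [v / (np.linalg.norm(v, axis=1, keepdims=True) + 1e-12) for v in views]
    return np.concatenate(normed, axis=1)


def train_domain_classifier(x, domain, lr=0.1, epochs=200):
    """Fit a logistic-regression domain classifier by gradient descent.

    x:      (n_entities, dim) unified representations, held fixed here.
    domain: (n_entities,) labels, 0 = auxiliary (source), 1 = target.
    Returns the learned weight vector and bias.
    """
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(x @ w + b)))  # predicted P(domain = target)
        grad = p - domain                        # gradient of the log loss w.r.t. the logits
        w -= lr * (x.T @ grad) / len(domain)
        b -= lr * grad.mean()
    return w, b


def predict_domain(x, w, b):
    """Return hard domain predictions (0 or 1) for each row of x."""
    return (x @ w + b > 0).astype(int)
```

In a domain-adaptation setting, a classifier like this usually serves as the signal for alignment: the better it separates auxiliary from target entities, the less aligned the two embedding spaces are. How the paper turns that signal into an embedding update is not stated in the abstract, so the sketch stops at the classifier.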
