Training-free Graph-based Imputation of Missing Modalities in Multimodal Recommendation
Daniele Malitesta, Emanuele Rossi, Claudio Pomo, Tommaso Di Noia, Fragkiskos D. Malliaros
TL;DR
The paper addresses missing modalities in multimodal recommender systems by formalizing the problem and reframing it as graph feature interpolation on the item-item co-purchase graph. It introduces four training-free, graph-aware imputation methods (NeighMean, MultiHop, PersPageRank, and Heat diffusion) that propagate the available multimodal features across the graph, so imputation can run as a pre-processing step before model training. Extensive experiments on six Amazon datasets and MicroLens across multiple baseline models show that graph-based imputations largely preserve or widen the gap between traditional and multimodal RSs and often outperform traditional and autoencoder-based imputations, with performance sensitive to hyperparameters such as TopN and hop count. The work demonstrates practical effectiveness, provides public code, and identifies future directions: robust end-to-end integration and the handling of cold-start and noisy data in multimodal recommendation.
Abstract
Multimodal recommender systems (RSs) represent items in the catalog through multimodal data (e.g., product images and descriptions) that, in some cases, might be noisy or (even worse) missing. In those scenarios, the common practice is to drop items with missing modalities and train the multimodal RSs on a subsample of the original dataset. To date, the problem of missing modalities in multimodal recommendation has received limited attention in the literature and lacks a precise formalisation of the kind available for missing information in traditional machine learning. In this work, we first provide a problem formalisation for missing modalities in multimodal recommendation. Second, by leveraging the user-item graph structure, we re-cast the problem of missing multimodal information as a problem of graph feature interpolation on the item-item co-purchase graph. On this basis, we propose four training-free approaches that propagate the available multimodal features throughout the item-item graph to impute the missing features. Extensive experiments on popular multimodal recommendation datasets demonstrate that our solutions can be seamlessly plugged into any existing multimodal RS and benchmarking framework while still preserving (or even widening) the performance gap between multimodal and traditional RSs. Moreover, we show that our graph-based techniques can perform better than traditional imputations in machine learning under different missing-modality settings. Finally, we analyse (for the first time in multimodal RSs) how feature homophily calculated on the item-item graph can influence our graph-based imputations.
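To make the idea of training-free propagation concrete, the following is a minimal sketch of a one-hop neighbour-mean imputation on an item-item graph obtained by projecting the user-item interactions. It is an illustration under stated assumptions, not the paper's implementation: the function names `item_item_graph` and `neighbor_mean_impute`, the co-occurrence weighting, and the choice to leave items without observed neighbours at zero are all hypothetical.

```python
import numpy as np


def item_item_graph(interactions: np.ndarray) -> np.ndarray:
    """Project a binary user-item matrix onto an item-item co-interaction graph.

    Assumption: edge weights are raw co-occurrence counts (no normalisation).
    """
    co_counts = interactions.T @ interactions  # (items x items) co-occurrences
    np.fill_diagonal(co_counts, 0)             # remove self-loops
    return co_counts.astype(float)


def neighbor_mean_impute(
    adjacency: np.ndarray, features: np.ndarray, observed: np.ndarray
) -> np.ndarray:
    """Fill missing feature rows with the weighted mean of observed neighbours.

    `observed` is a boolean mask over items; rows where it is False are imputed.
    Assumption: items with no observed neighbour are left as zero vectors.
    """
    imputed = features.astype(float).copy()
    imputed[~observed] = 0.0                       # clear placeholders (e.g. NaN)
    weights = adjacency * observed[np.newaxis, :]  # only observed items contribute
    totals = weights.sum(axis=1, keepdims=True)
    fill = ~observed & (totals[:, 0] > 0)          # missing items we can impute
    imputed[fill] = (weights[fill] @ imputed) / totals[fill]
    return imputed


# Toy example: 3 users, 4 items; item 2 has no features.
interactions = np.array([
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 1],
])
features = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [np.nan, np.nan],  # missing modality
    [1.0, 1.0],
])
observed = np.array([True, True, False, True])

adjacency = item_item_graph(interactions)
imputed = neighbor_mean_impute(adjacency, features, observed)
```

Because no parameters are learned, a step of this kind can run once as pre-processing, and the completed feature matrix can then be passed to any multimodal RS. Multi-hop or diffusion-style variants would repeat or reweight the propagation rather than stop after one hop.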
