Ducho 2.0: Towards a More Up-to-Date Unified Framework for the Extraction of Multimodal Features in Recommendation

Matteo Attimonelli, Danilo Danese, Daniele Malitesta, Claudio Pomo, Giuseppe Gassi, Tommaso Di Noia

TL;DR

Ducho 2.0 tackles the lack of standardization in multimodal feature extraction for recommender systems by delivering a highly configurable framework that supports custom extractor models, advanced preprocessing, and multimodal-by-design backbones like CLIP. It emphasizes usability and performance through PyTorch dataloaders and broad backend support, enabling seamless fusion of visual and textual modalities. The paper demonstrates an end-to-end multimodal benchmarking pipeline on Amazon Baby, comparing traditional visual/textual extractors with modern multimodal models across VBPR, BM3, and FREEDOM recommenders, and showing the framework's utility for extensive benchmarking. By enabling end-users to run reproducible experiments and easily incorporate new models, Ducho 2.0 aims to accelerate research and practical deployment in multimodal recommendation settings.

Abstract

In this work, we introduce Ducho 2.0, the latest stable version of our framework. Unlike Ducho, Ducho 2.0 offers a more personalized user experience through the definition and import of custom extraction models fine-tuned on specific tasks and datasets. Moreover, the new version can extract and process features through multimodal-by-design large models. Notably, all these new features are supported by optimized data loading and storing to local memory. To showcase the capabilities of Ducho 2.0, we demonstrate a complete multimodal recommendation pipeline, from extraction/processing to the final recommendation. The idea is to provide practitioners and experienced scholars with a ready-to-use tool that, placed on top of any multimodal recommendation framework, may allow them to run extensive benchmarking analyses. All materials are accessible at https://github.com/sisinflab/Ducho.

Paper Structure (12 sections, 2 figures, 1 table)

This paper contains 12 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: An overview of Ducho 2.0, where newly introduced functionalities have a hatched background.
  • Figure 2: A toy example with the YAML configuration. New features for Ducho 2.0 are highlighted in green.