Dataset-Agnostic Recommender Systems

Tri Kurniawan Wijaya; Edoardo D'Amico; Xinyang Shao

TL;DR

Dataset-Agnostic Recommender Systems (DAReS) address the challenge of dataset-specific tuning in recommender pipelines by introducing a fully dataset-agnostic framework built around a Dataset Description Language (DsDL). The approach automates feature engineering, model selection, and hyperparameter tuning to enable zero-configuration deployment across diverse datasets, reducing the need for domain expertise. Key contributions include the formal DsDL schema, the definition of Level-1 and Level-2 automation, and a comparison with traditional recommender systems. DAReS improves reproducibility and accessibility, but it trades off some task-specific customization and incurs higher computational overhead. Future work targets fully autonomous Level-2 operation through richer metadata and generalized strategies for model and feature selection.

Abstract

Recommender systems have become a cornerstone of personalized user experiences, yet their development typically involves significant manual intervention, including dataset-specific feature engineering, hyperparameter tuning, and configuration. To this end, we introduce a novel paradigm, Dataset-Agnostic Recommender Systems (DAReS), which aims to enable a single codebase to autonomously adapt to various datasets without the need for fine-tuning, for a given recommender system task. Central to this approach is the Dataset Description Language (DsDL), a structured format that provides metadata about the dataset's features and labels. This metadata allows the system to understand the dataset's characteristics and to autonomously manage processes such as feature selection, missing-value imputation, noise removal, and hyperparameter optimization. By reducing the need for domain-specific expertise and manual adjustments, DAReS offers a more efficient and scalable solution for building recommender systems across diverse application domains. It addresses critical challenges in the field, such as reusability, reproducibility, and accessibility for non-expert users or entry-level researchers.
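
To make the role of a dataset description concrete, the following is a minimal sketch of how declared feature metadata could drive preprocessing decisions without dataset-specific code. It is an illustration only: the actual DsDL schema is not reproduced in this summary, so the field names (`name`, `kind`), the kind labels, the step names, and the `plan_preprocessing` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureSpec:
    """One entry of a hypothetical dataset description (not the real DsDL schema)."""
    name: str
    kind: str  # assumed kinds: "user_id", "item_id", "categorical", "numerical", "label"


# Hypothetical mapping from a declared kind to generic preprocessing steps.
# The step names are placeholders, not functions from any real library.
DEFAULT_STEPS = {
    "user_id": ["index_encode"],
    "item_id": ["index_encode"],
    "categorical": ["impute_most_frequent", "index_encode"],
    "numerical": ["impute_median", "standardize"],
    "label": [],
}


def plan_preprocessing(description):
    """Derive a per-feature preprocessing plan from the description alone."""
    plan = {}
    for spec in description:
        if spec.kind not in DEFAULT_STEPS:
            raise ValueError(f"unknown feature kind: {spec.kind!r}")
        # Copy the list so callers cannot mutate the shared defaults.
        plan[spec.name] = list(DEFAULT_STEPS[spec.kind])
    return plan


# Example description for an invented movie-rating dataset.
movie_description = [
    FeatureSpec("user", "user_id"),
    FeatureSpec("movie", "item_id"),
    FeatureSpec("genre", "categorical"),
    FeatureSpec("age", "numerical"),
    FeatureSpec("rating", "label"),
]

plan = plan_preprocessing(movie_description)
for feature, steps in plan.items():
    print(feature, steps)
```

The point of the sketch is the design choice rather than the specific steps: the same `plan_preprocessing` code would serve any dataset, because everything dataset-specific lives in the description passed to it.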

Paper Structure

This paper contains 11 sections, 1 table.

Introduction
Dataset-Agnostic Recommender Systems (DAReS)
Dataset Description Language (DsDL)
Autonomous Feature Engineering and Preprocessing
Model Selection and Hyperparameter Tuning
Model Evaluation
Comparison to Traditional Recommender Systems
Automation Levels of Recommender Systems
Level-1 Automation: Dataset-Agnostic but Task-Specific
Level-2 Automation: Task-Agnostic and Dataset-Agnostic
Conclusion and Future Work