
RecSys Challenge 2023: From data preparation to prediction, a simple, efficient, robust and scalable solution

Maxime Manderlier, Fabian Lecron

TL;DR

The paper tackles predicting app installs from ad impressions in ShareChat and Moj, using a dataset with categorical, binary, and numerical features. It proposes a compact neural network with embedding-based handling of categorical data, regression-based missing-value imputation, and min-max normalization, exploring both single-output and two-output architectures. The approach achieves a best challenge Log-Loss of 6.622686 and demonstrates that a small, robust model scales with data while remaining production-friendly. It also analyzes the trade-offs between single- and multi-output configurations and emphasizes practical deployment considerations for online advertising contexts.
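As a rough illustration of two ingredients named above, the following is a minimal sketch of min-max normalization for a numerical feature and of a mean binary log-loss. It is not the authors' code: the function names `min_max_normalize` and `binary_log_loss` are hypothetical, and the loss shown is the standard mean binary cross-entropy, which may be scaled differently from the score reported for the challenge.

```python
import math

def min_max_normalize(values):
    """Scale a list of numbers to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: avoid division by zero by mapping constant columns to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def binary_log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Toy values, chosen only for illustration (not taken from the challenge data).
scaled = min_max_normalize([2.0, 4.0, 6.0])
loss = binary_log_loss([1, 0], [0.5, 0.5])
```

In practice the minimum and maximum would presumably be computed on the training split and reused at prediction time, so that the same scaling is applied to unseen data.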

Abstract

The RecSys Challenge 2023, presented by ShareChat, consists of predicting whether a user will install an application on their smartphone after seeing advertising impressions in the ShareChat & Moj apps. This paper presents the solution of 'Team UMONS' to this challenge, which gives accurate results (our best score is 6.622686) with a relatively small model that can easily be deployed in different production configurations. Our solution scales well as the dataset size increases and can be used with datasets containing missing values.

Paper Structure (23 sections, 2 figures, 4 tables)


Figures (2)

  • Figure 1: Predict 'is_installed'
  • Figure 2: Predict 'is_clicked' and 'is_installed'