Generalizing Linear Autoencoder Recommenders with Decoupled Expected Quadratic Loss

Ruixin Guo, Xinyu Li, Hao Zhou, Yang Zhou, Ruoming Jin

TL;DR

Empirical results on benchmark datasets show that the $b>0$ solutions provided by DEQL outperform the $b = 0$ EDLAE baseline, demonstrating that DEQL expands the solution space and enables the discovery of models with better testing performance.

Abstract

Linear autoencoders (LAEs) have gained increasing popularity in recommender systems due to their simplicity and strong empirical performance. Most LAE models, including the Emphasized Denoising Linear Autoencoder (EDLAE) introduced by Steck (2020), use quadratic loss during training. However, the original EDLAE only provides closed-form solutions for the hyperparameter choice $b = 0$, which limits its capacity. In this work, we generalize the EDLAE objective into a Decoupled Expected Quadratic Loss (DEQL). We show that DEQL simplifies the derivation of EDLAE solutions and reveals solutions in a broader hyperparameter range $b > 0$, which were not derived in Steck's original paper. Additionally, we propose an efficient algorithm based on Miller's matrix inverse theorem to ensure computational tractability for the $b > 0$ case. Empirical results on benchmark datasets show that the $b > 0$ solutions provided by DEQL outperform the $b = 0$ EDLAE baseline, demonstrating that DEQL expands the solution space and enables the discovery of models with better testing performance.
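
The abstract names Miller's matrix inverse theorem but this excerpt does not state it. As a rough illustration only, and not the paper's algorithm, the rank-one case of the theorem says that if $G$ and $G + H$ are nonsingular and $H$ has rank one, then $(G + H)^{-1} = G^{-1} - \frac{1}{1 + g} G^{-1} H G^{-1}$ with $g = \mathrm{tr}(H G^{-1})$. The sketch below checks this identity numerically with NumPy. The function name `miller_rank_one_inverse`, the matrix sizes, and the random test matrices are assumptions made for the example and do not come from the paper.

```python
import numpy as np


def miller_rank_one_inverse(G_inv, H):
    """Return (G + H)^{-1} from G^{-1}, assuming H has rank one.

    Hypothetical helper for illustration; it is not taken from the paper.
    """
    g = np.trace(H @ G_inv)
    if np.isclose(g, -1.0):
        raise ValueError("G + H is singular because trace(H G^{-1}) = -1")
    return G_inv - (G_inv @ H @ G_inv) / (1.0 + g)


rng = np.random.default_rng(0)
n = 5

# Build a symmetric positive definite G so that it is safely invertible.
A = rng.standard_normal((n, n))
G = A @ A.T + n * np.eye(n)

# Rank-one positive semidefinite perturbation, so 1 + g > 1 is guaranteed.
u = rng.standard_normal(n)
H = np.outer(u, u)

G_inv = np.linalg.inv(G)
updated = miller_rank_one_inverse(G_inv, H)   # update that reuses G^{-1}
direct = np.linalg.inv(G + H)                 # direct inversion for comparison

max_abs_error = np.max(np.abs(updated - direct))
print("max abs difference:", max_abs_error)
```

The appeal of an update of this kind is that an already computed $G^{-1}$ is reused with matrix products instead of inverting $G + H$ from scratch. How the paper applies the theorem to the $b > 0$ case is not shown in this excerpt.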

Paper Structure (28 sections, 7 theorems, 59 equations, 3 figures, 6 tables)

Key Result

Lemma 3.2

The $H^{(i)}$ and $v^{(i)}$ in (eq:5) can be expressed as $H^{(i)} = G^{(i)}\odot R^TR$ and $v^{(i)} = u^{(i)} \odot R^TR_{*i}$, where $G^{(i)} \in \mathbb{R}^{n \times n}$ and $u^{(i)} \in \mathbb{R}^n$ satisfy for $k, l \in \{1, 2, ..., n\}$.

Figures (3)

  • Figure 1: Sensitivity to the $b/a$ ratio across different datasets
  • Figure 2: Comparison of the closed-form solution sets of DEQL and EDLAE. The white region represents the set of original EDLAE solutions, and the green region represents the remaining solutions covered by DEQL. The orange circle marks solutions derived from the original EDLAE loss, whereas the cyan circle marks solutions obtained from the extended EDLAE loss (with hyperparameter choices $b > a$, which go beyond the original EDLAE constraints but still yield valid solutions).
  • Figure 3: Diagonal value distribution across different datasets

Theorems & Definitions (8)

  • Definition 3.1
  • Lemma 3.2
  • Theorem 3.3
  • Theorem 3.4
  • Proposition 3.5
  • Theorem 4.1
  • Corollary C.1
  • Theorem D.1