AdaS&S: a One-Shot Supernet Approach for Automatic Embedding Size Search in Deep Recommender System

He Wei; Yuekui Yang; Yang Zhang; Haiyang Wu; Meixi Liu; Shaoping Ma

TL;DR

This work proposes AdaS&S, a novel one-shot framework for Automatic Embedding size Search (AES). It builds a supernet encompassing various candidate embeddings and performs AES as a search over network architectures within that supernet. It also introduces a resource competition penalty to balance model effectiveness against the memory cost of embeddings.

Abstract

Deep Learning Recommendation Models (DLRMs) use an embedding layer to represent categorical features. Traditional DLRMs adopt a single embedding size for all features, which leads to suboptimal performance and redundant parameters. Many Automatic Embedding size Search (AES) works therefore aim to find mixed embedding sizes that give strong model performance. However, previous AES works struggle to address several challenges together: (1) the searched embedding sizes are unstable; (2) the recommendation quality obtained with the AES results is unsatisfactory; (3) the memory cost of the embeddings is uncontrollable. To address these challenges, we propose a novel one-shot AES framework called AdaS&S, in which a supernet encompassing various candidate embeddings is built and AES is performed as a search over network architectures within it. Our framework contains two main stages. In the first stage, we decouple parameter training from embedding size search and propose the Adaptive Sampling method to yield a well-trained supernet, which in turn helps to produce stable AES results. In the second stage, to obtain embedding sizes that benefit model performance, we design a reinforcement learning search process that uses the previously trained supernet. Meanwhile, to adapt the search to a specific resource constraint, we introduce the resource competition penalty to balance model effectiveness against the memory cost of embeddings. We conduct extensive experiments on public datasets to show the superiority of AdaS&S. Our method can improve AUC by about 0.3% while saving about 20% of model parameters. Empirical analysis also shows that the stability of the search results in AdaS&S significantly exceeds that of other methods.
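To make the trade-off between model effectiveness and embedding memory concrete, the sketch below scores a candidate set of embedding sizes by its AUC minus a penalty for exceeding a parameter budget. The abstract does not state the form of the resource competition penalty, so the hinge-style penalty used here is an assumption for illustration only, and every name (`embedding_params`, `penalized_reward`, `penalty_weight`, the vocabulary sizes, and the AUC value) is hypothetical rather than taken from the paper.

```python
from typing import Sequence


def embedding_params(vocab_sizes: Sequence[int], dims: Sequence[int]) -> int:
    """Total embedding parameters: one (vocab_size x dim) table per feature."""
    if len(vocab_sizes) != len(dims):
        raise ValueError("need exactly one embedding size per feature")
    return sum(v * d for v, d in zip(vocab_sizes, dims))


def penalized_reward(
    auc: float,
    vocab_sizes: Sequence[int],
    dims: Sequence[int],
    budget: int,
    penalty_weight: float = 1.0,
) -> float:
    """AUC minus a hinge penalty on relative budget overshoot.

    ASSUMPTION: this is an illustrative penalty, not the paper's
    resource competition penalty, whose form the abstract does not give.
    """
    cost = embedding_params(vocab_sizes, dims)
    overshoot = max(0.0, cost / budget - 1.0)  # zero when within budget
    return auc - penalty_weight * overshoot


# Hypothetical feature vocabularies and two candidate size assignments.
vocab_sizes = [1000, 50, 200000]
uniform = [16, 16, 16]  # one size for every feature
mixed = [16, 4, 12]     # per-feature sizes, as an AES search might return

uniform_cost = embedding_params(vocab_sizes, uniform)
budget = uniform_cost * 4 // 5  # allow 80% of the uniform parameter count

reward_uniform = penalized_reward(0.80, vocab_sizes, uniform, budget)
reward_mixed = penalized_reward(0.80, vocab_sizes, mixed, budget)
```

With the same hypothetical AUC for both candidates, the mixed assignment fits the budget and keeps its full score, while the uniform assignment is penalized for exceeding it. In a reinforcement learning search, a score of this kind could serve as the reward for a sampled set of embedding sizes.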
