EPR-GAIL: An EPR-Enhanced Hierarchical Imitation Learning Framework to Simulate Complex User Consumption Behaviors

Tao Feng; Yunke Zhang; Huandong Wang; Yong Li

TL;DR

The paper tackles the challenge of generating high-fidelity synthetic sequential user consumption data for applications such as sales forecasting and store recommendation. It introduces EPR-GAIL, a framework that combines the Exploration and Preferential Return (EPR) model with Generative Adversarial Imitation Learning (GAIL) to model purchase, exploration, and preference decisions as a hierarchical decision process. Key contributions include a decision-making feature extractor built on self-attention and an LSTM, a three-layer hierarchical policy, and a knowledge-enhanced reward that fuses a neural discriminator with an EPR-based discriminator using uncertainty weighting. Experiments on two real-world datasets, from Beijing and Guiyang, show a substantial improvement in data fidelity (over 19% reduction in JS divergence) and in downstream tasks: sales prediction improves by up to 35.29% and location recommendation by up to 11.19%, which indicates practical value for synthetic data generation in online retail settings.
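As a rough illustration of the fidelity metric mentioned above, the following minimal sketch computes the Jensen-Shannon (JS) divergence between two discrete distributions, for example a histogram of some statistic in real data versus generated data. The function name `js_divergence` and the choice of base-2 logarithms are assumptions made for this example; the summary does not say which statistics the paper compares or how the histograms are built.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence between two discrete distributions.

    `p` and `q` are sequences of non-negative weights over the same bins
    (hypothetical inputs; they are normalized here). With base-2 logs the
    result lies in [0, 1]: 0 for identical distributions, 1 for disjoint ones.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    sum_p, sum_q = sum(p), sum(q)
    if sum_p <= 0 or sum_q <= 0:
        raise ValueError("each distribution needs positive total mass")
    p = [x / sum_p for x in p]
    q = [x / sum_q for x in q]
    # Mixture distribution M = (P + Q) / 2
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # KL(a || b); bins with a == 0 contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

A lower value means the generated distribution is closer to the real one, so a "reduction in JS divergence" corresponds to higher fidelity.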

Abstract

User consumption behavior data, which records individuals' online spending history at various types of stores, has been widely used in applications such as store recommendation, site selection, and sales forecasting. However, its value is limited by deficiencies in data comprehensiveness and by changes in application scenarios. Thus, generating high-quality sequential consumption data by simulating complex user consumption behaviors is of great importance to real-world applications. The two branches of existing sequence generation methods are both limited in quality. Model-based methods rely on simplified assumptions and fail to model the complex decision process of user consumption, while data-driven methods that emulate real-world data are vulnerable to noise, unobserved behaviors, and dynamic decision spaces. In this work, we propose to enhance the fidelity and trustworthiness of the data-driven Generative Adversarial Imitation Learning (GAIL) method by blending it with the Exploration and Preferential Return (EPR) model. The core idea of our EPR-GAIL framework is to model user consumption behaviors as a complex EPR decision process, which consists of purchase, exploration, and preference decisions. Specifically, we design the hierarchical policy function in the generator as a realization of the EPR decision process and employ the probability distributions of the EPR model to guide the reward function in the discriminator. Extensive experiments on two real-world datasets of user consumption behaviors on an online platform demonstrate that the EPR-GAIL framework outperforms the best state-of-the-art baseline by over 19% in terms of data fidelity. Furthermore, the generated consumption behavior data can improve the performance of sales prediction and location recommendation by up to 35.29% and 11.19%, respectively, validating its advantage for practical applications.
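The three-level decision process described above (purchase, then exploration, then preference) can be sketched as a simple rule-based sampler. This is only an illustration of the classic EPR idea, not the paper's learned hierarchical policy: the function `epr_step`, the fixed purchase probability `p_purchase`, the exploration form `rho * S ** (-gamma)` with its default values, and the uniform choice among unvisited stores are all assumptions made for this example. In EPR-GAIL these decisions are produced by neural policy layers trained adversarially.

```python
import random
from collections import Counter


def epr_step(visits, all_stores, rng, p_purchase=0.5, rho=0.6, gamma=0.2):
    """Sample one step of a three-level EPR-style decision (illustrative).

    `visits` is a Counter of past purchases per store and is updated in place.
    Returns the chosen store, or None when no purchase is made.
    All parameter values are hypothetical placeholders.
    """
    if not all_stores:
        raise ValueError("all_stores must not be empty")

    # Level 1, purchase decision: buy something at this step or not.
    if rng.random() >= p_purchase:
        return None

    unvisited = [s for s in all_stores if s not in visits]
    n_distinct = len(visits)  # S: number of distinct stores visited so far

    # Level 2, exploration decision: explore with probability rho * S^(-gamma),
    # so exploration becomes less likely as more distinct stores are known.
    p_explore = 1.0 if n_distinct == 0 else rho * n_distinct ** (-gamma)

    if unvisited and rng.random() < p_explore:
        # Level 3 (explore): pick a new store; uniform choice is an assumption.
        store = rng.choice(unvisited)
    else:
        # Level 3 (return): preferential return, weighted by past visit counts.
        known = list(visits)
        store = rng.choices(known, weights=[visits[s] for s in known], k=1)[0]

    visits[store] += 1
    return store


# Usage: simulate 20 steps for one user over a small, made-up store set.
rng = random.Random(0)
history = Counter()
trace = [epr_step(history, ["cafe", "grocery", "bookshop"], rng) for _ in range(20)]
```

Replacing the fixed probabilities with state-dependent outputs of a policy network gives the general shape of a hierarchical generator, while the rule-based probabilities can serve as prior knowledge for the reward, as the abstract describes at a high level.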
