A Utility-Mining-Driven Active Learning Approach for Analyzing Clickstream Sequences

Danny Y. C. Wang, Lars Arne Jordanger, Jerry Chun-Wei Lin

TL;DR

The proposed HUSPM-SHAP model refines the processing of e-commerce data, leading to optimized, cost-efficient prediction modeling, and reduces labeling requirements while maintaining high predictive performance.

Abstract

In the rapidly evolving e-commerce industry, the ability to select high-quality data for model training is essential. This study introduces the High-Utility Sequential Pattern Mining using SHAP values (HUSPM-SHAP) model, a utility-mining-based active learning strategy, to tackle this challenge. We found that the parameter settings for positive and negative SHAP values affect the model's mining outcomes, which introduces a key consideration into the active learning framework. In extensive experiments aimed at predicting whether a behavior leads to a purchase, the HUSPM-SHAP model demonstrates its superiority across diverse scenarios. The results highlight the model's ability to reduce labeling needs while maintaining high predictive performance. Our findings demonstrate the model's capability to refine e-commerce data processing, steering towards more streamlined, cost-effective prediction modeling.

Paper Structure

This paper contains 15 sections, 2 figures, 8 tables, and 1 algorithm.

Figures (2)

  • Figure 1: A high-level overview of the clickstream analysis workflow employed in this study
  • Figure 2: A SHAP force plot example showing the individual clickstream sequence contribution