SeqRFM: Fast RFM Analysis in Sequence Data

Yanxin Zheng, Wensheng Gan, Zefeng Chen, Pinlyu Zhou, Philippe Fournier-Viger

TL;DR

SeqRFM is an effective algorithm that combines sequential pattern mining with RFM models and identifies sequences with high recency, high frequency, and high monetary value.

Abstract

In recent years, data mining technologies have been widely applied in many domains, including e-commerce. In customer relationship management (CRM), the RFM analysis model is one of the most effective approaches to increasing the profits of major enterprises. However, with the rapid development of e-commerce, the diversity and abundance of e-commerce data pose a challenge to mining efficiency. Moreover, in actual market transactions, the chronological order of transactions reflects customer behavior and preferences. To address these challenges, we develop an effective algorithm called SeqRFM, which combines sequential pattern mining with RFM models. SeqRFM considers each customer's recency (R), frequency (F), and monetary (M) scores to represent the significance of the customer and identifies sequences with high recency, high frequency, and high monetary value. A series of experiments demonstrates the superiority and effectiveness of the SeqRFM algorithm compared to the most advanced RFM algorithms based on sequential pattern mining. The source code and datasets are available on GitHub at https://github.com/DSI-Lab1/SeqRFM.
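As background for the R, F, and M scores mentioned in the abstract, the following is a minimal sketch of classic per-customer RFM aggregation. It is not the SeqRFM algorithm. The transaction tuples, the `rfm_table` helper, and the reference date are all hypothetical and chosen only for illustration.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("c1", date(2024, 1, 5), 20.0),
    ("c1", date(2024, 3, 1), 35.0),
    ("c2", date(2023, 11, 20), 120.0),
    ("c3", date(2024, 2, 14), 15.0),
    ("c3", date(2024, 2, 28), 10.0),
    ("c3", date(2024, 3, 10), 5.0),
]


def rfm_table(rows, today):
    """Return {customer: (recency_days, frequency, monetary)}.

    recency_days: days since the customer's latest purchase (lower is better)
    frequency:    number of purchases
    monetary:     total amount spent
    """
    latest = {}
    for customer, day, amount in rows:
        last, freq, total = latest.get(customer, (None, 0, 0.0))
        if last is None or day > last:
            last = day
        latest[customer] = (last, freq + 1, total + amount)
    return {
        customer: ((today - last).days, freq, total)
        for customer, (last, freq, total) in latest.items()
    }


rfm = rfm_table(transactions, today=date(2024, 3, 15))
for customer, (r, f, m) in sorted(rfm.items()):
    print(customer, r, f, m)
```

In this toy data, `c3` is the most recent and most frequent buyer while `c2` has the highest monetary total, which is why the three dimensions are usually considered together rather than in isolation.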

Paper Structure

This paper contains 18 sections, 3 theorems, 4 equations, 5 figures, 5 tables, and 3 algorithms.

Key Result

Theorem 1

Let $Q'$ be a sub-sequence of a sequence $Q$, i.e., $Q' \sqsubseteq Q \wedge Q \sqsubseteq \mathcal{D}$. Then, the monetary value of $Q$ is always less than or equal to the sequence-weighted monetary value of $Q'$, that is, $M(Q') \leq \mathrm{SWM}(Q')$. This property holds for any extension of $Q'$ as
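The upper-bound idea behind Theorem 1 can be illustrated with a small sketch. The definitions used here are simplifying assumptions, not the paper's exact ones: the monetary value of a pattern is taken from its leftmost match in each sequence, and SWM is taken to be the total monetary value of every sequence that contains the pattern. The toy `database` and all function names are hypothetical.

```python
# Hypothetical toy database: each sequence is a list of itemsets,
# and each itemset maps an item to its (non-negative) monetary value.
database = [
    [{"a": 2, "d": 3}, {"f": 4}],
    [{"c": 1}, {"a": 5, "d": 1}, {"f": 2}],
    [{"c": 6}, {"a": 2}],
]


def first_match(sequence, pattern):
    """Monetary value of the leftmost match of pattern, or None if absent.

    Assumption: a real miner may consider all instances of the pattern;
    the leftmost match is used here only to keep the sketch short.
    """
    total, pos = 0, 0
    for itemset in pattern:
        # Advance to the next itemset of the sequence that contains `itemset`.
        while pos < len(sequence) and not set(itemset).issubset(sequence[pos]):
            pos += 1
        if pos == len(sequence):
            return None
        total += sum(sequence[pos][item] for item in itemset)
        pos += 1
    return total


def monetary(pattern):
    """Sum of matched monetary values over all sequences containing pattern."""
    values = (first_match(seq, pattern) for seq in database)
    return sum(v for v in values if v is not None)


def swm(pattern):
    """Assumed SWM: total value of every sequence that contains pattern."""
    return sum(
        sum(sum(itemset.values()) for itemset in seq)
        for seq in database
        if first_match(seq, pattern) is not None
    )


base = [{"a", "d"}]
extended = [{"a", "d"}, {"f"}]
for pattern in (base, extended):
    print(pattern, monetary(pattern), swm(pattern))
```

Under these assumptions the matched items are always a subset of the items in the containing sequence, so `monetary(p) <= swm(p)` for every pattern. Extending a pattern can only shrink the set of sequences that contain it, so the assumed SWM of the base pattern also bounds the monetary value of its extensions, which is what makes such a bound usable for pruning.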

Figures (5)

  • Figure 1: The occurrences of $\langle \{a, d\} \rangle$ and $\langle \{a, d\}, \{f\} \rangle$.
  • Figure 2: MT-chain for sequence $\langle \{c\} \rangle$.
  • Figure 3: MT-chain for sequence $\langle \{c\}, \{a\} \rangle$.
  • Figure 4: Runtime in each dataset under different values of $\beta$.
  • Figure 5: Maximum memory usage in each dataset under different values of $\beta$.

Theorems & Definitions (22)

  • Definition 1: Item, itemset and sequence
  • Definition 2: Monetary and timestamp database
  • Definition 3: Sub-sequence and remaining sequence [wang2016efficiently]
  • Definition 4: Matching
  • Definition 5: Instance [wang2016efficiently]
  • Definition 6: Monetary value
  • Definition 7: Recency value
  • Definition 8: Frequency value
  • Definition 9: Extension and extension item [chen2023uucpm]
  • Definition 10: Compactness constraint [hu2013knowledge]
  • ...and 12 more