Explainability, risk modeling, and segmentation based customer churn analytics for personalized retention in e-commerce

Sanjula De Alwis, Indrajith Ekanayake

TL;DR

This work tackles e-commerce churn by proposing an integrated three-component framework that blends explainable AI, survival analysis, and RFM-based segmentation to support personalized retention. The predictive core uses XGBoost with TreeSHAP explanations to attribute churn risk to features, while time-to-churn is modeled via the Kaplan-Meier estimator $\widehat{S}(t)$, which accounts for censored customers (those who have not churned by the end of the observation window) as well as observed durations. RFM-driven segmentation yields cohorts such as Best, Loyal, Lost, and New, enabling targeted, time-sensitive interventions. Together, these components provide actionable guidance on why churn occurs, when interventions are most effective, and which customer groups to prioritize for retention efforts.
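
To make the survival component concrete, the following is a minimal sketch of the standard Kaplan-Meier product-limit estimate, $\widehat{S}(t) = \prod_{t_i \le t} (1 - d_i / n_i)$, where $d_i$ is the number of churn events at time $t_i$ and $n_i$ is the number of customers still at risk just before $t_i$. The function name `kaplan_meier`, the argument names, and the toy data are illustrative assumptions, not the authors' implementation.

```python
from collections import Counter


def kaplan_meier(durations, churned):
    """Return [(t, S(t))] at each distinct time where a churn event occurs.

    durations: observed time per customer (e.g. days since first purchase)
    churned:   True if the customer churned at that time, False if censored
    Both names are hypothetical; the paper does not specify a data layout.
    """
    events = Counter(t for t, c in zip(durations, churned) if c)
    exits = Counter(durations)       # churned or censored customers leaving at t
    at_risk = len(durations)         # n_i: customers still under observation
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)         # d_i: churn events at time t
        if d:
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]          # censored customers also leave the risk set
    return curve


# Toy data (made up for illustration): three churners, three censored customers.
durations = [30, 45, 45, 60, 90, 90]
churned = [True, True, False, True, False, False]
curve = kaplan_meier(durations, churned)
```

Censored customers never reduce $\widehat{S}(t)$ directly; they only shrink the risk set for later event times, which is how the estimator uses partial observations without treating them as churn.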

Abstract

In online retail, customer acquisition typically incurs higher costs than customer retention, motivating firms to invest in churn analytics. However, many contemporary churn models operate as opaque black boxes, limiting insight into the determinants of attrition, the timing of retention opportunities, and the identification of high-risk customer segments. Accordingly, the emphasis should shift from prediction alone to the design of personalized retention strategies grounded in interpretable evidence. This study advances a three-component framework that integrates explainable AI to quantify feature contributions, survival analysis to model time-to-event churn risk, and RFM profiling to segment customers by transactional behaviour. In combination, these methods enable the attribution of churn drivers, estimation of intervention windows, and prioritization of segments for targeted actions, thereby supporting strategies that reduce attrition and strengthen customer loyalty.
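
As a rough illustration of the RFM profiling step, the sketch below aggregates raw transactions into recency, frequency, and monetary values and then maps each customer to a segment. The segment names Best, Loyal, Lost, and New come from the summary above; the function names, the fallback label "Other", and every threshold (`recent_days`, `lapsed_days`, `loyal_orders`, `high_spend`) are hypothetical choices made for this example and are not taken from the paper.

```python
from datetime import date


def rfm_table(transactions, as_of):
    """Aggregate (customer_id, order_date, amount) rows into R, F, M values."""
    table = {}
    for cust, day, amount in transactions:
        rec = table.setdefault(cust, {"last": day, "frequency": 0, "monetary": 0.0})
        rec["last"] = max(rec["last"], day)
        rec["frequency"] += 1
        rec["monetary"] += amount
    return {
        cust: {
            "recency": (as_of - rec["last"]).days,   # days since last order
            "frequency": rec["frequency"],           # number of orders
            "monetary": rec["monetary"],             # total spend
        }
        for cust, rec in table.items()
    }


def segment(rfm, recent_days=30, lapsed_days=180, loyal_orders=3, high_spend=500.0):
    """Assign a segment using assumed, rule-based cutoffs (not the paper's)."""
    if rfm["recency"] > lapsed_days:
        return "Lost"
    if rfm["frequency"] == 1 and rfm["recency"] <= recent_days:
        return "New"
    if rfm["frequency"] >= loyal_orders and rfm["monetary"] >= high_spend:
        return "Best"
    if rfm["frequency"] >= loyal_orders:
        return "Loyal"
    return "Other"


# Made-up transactions for four hypothetical customers.
transactions = [
    ("A", date(2024, 6, 1), 200.0), ("A", date(2024, 6, 15), 200.0),
    ("A", date(2024, 6, 25), 200.0),
    ("B", date(2024, 4, 1), 50.0), ("B", date(2024, 5, 1), 50.0),
    ("B", date(2024, 6, 10), 50.0),
    ("C", date(2023, 10, 1), 80.0),
    ("D", date(2024, 6, 20), 40.0),
]
rfm = rfm_table(transactions, as_of=date(2024, 6, 30))
segments = {cust: segment(values) for cust, values in rfm.items()}
```

In practice RFM scores are often derived from quantile bins rather than fixed cutoffs; fixed thresholds are used here only to keep the example short and deterministic.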

Paper Structure

This paper contains 10 sections, 2 equations, 3 figures, 1 table.

Figures (3)

  • Figure 1: SHAP analysis
  • Figure 2: Distribution of Recency, Frequency, and Monetary scores by RFM segments.
  • Figure 3: Kaplan-Meier survival curve illustrating customer retention over time.