Estimating Individual Customer Lifetime Values with R: The CLVTools Package

Markus Meierer, Patrick Bachmann, Jeffrey Näf, Patrik Schilter, René Algesheimer

TL;DR

This work introduces CLVTools, an R package that unifies probabilistic models for estimating individual CLV in non-contractual settings, centering on the Pareto/NBD model for latent attrition and the Gamma-Gamma model for spending. It covers extensions for time-invariant and time-varying covariates, regularization, and covariate equality constraints, enabling robust, scalable, and interpretable CLV predictions. The paper details theoretical foundations, practical workflows (data preparation, estimation, diagnostics, and prediction), and guidance for applying these models with covariates and uncertainty assessments. The approach supports both holdout-based evaluation and final predictions, including predictions for prospective customers, and offers advanced techniques for model comparison and hypothesis testing, with a strong emphasis on actionable managerial insights.

Abstract

Customer lifetime value (CLV) describes a customer's long-term economic value for a business. This metric is widely used in marketing, for example, to select customers for a marketing campaign. However, modeling CLV is challenging. When relying on customers' purchase histories, the input data is sparse. Additionally, because CLV has a long-term focus, prediction horizons are often longer than estimation periods. Probabilistic models overcome these challenges and are therefore popular among researchers and practitioners. Practitioners in particular appreciate that the models apply to both small and large datasets and deliver robust predictive performance without fine-tuning. Their popularity rests on three characteristics: data parsimony, scalability, and predictive accuracy. The R package CLVTools provides an efficient and user-friendly framework for applying key probabilistic models such as the Pareto/NBD and Gamma-Gamma models. It also provides access to recent model extensions that add time-invariant and time-varying covariates, parameter regularization, and equality constraints. This article gives an overview of the fundamental ideas behind these statistical models and illustrates how to apply them to derive CLV predictions for existing and new customers.

Paper Structure

This paper contains 33 sections, 36 equations, 12 figures, 5 tables.

Figures (12)

  • Figure 1: Schematic transaction history for a prototypical customer
  • Figure 2: Schematic transaction history of a prototypical customer with time-varying covariates
  • Figure 3: General modeling structure -- Pareto/NBD model & Gamma-Gamma model (Gam refers to the Gamma distribution, and Exp refers to the Exponential distribution.)
  • Figure 4: Tracking plot for the apparel dataset
  • Figure 5: Timings plot for the apparel dataset
  • ...and 7 more figures