Reinforcement Learning applied to Insurance Portfolio Pursuit

Edward James Young; Alistair Rogers; Elliott Tong; James Jordon

TL;DR

We formulate portfolio pursuit as a sequential decision-making problem and devise a novel reinforcement learning algorithm to solve it. On a synthetic market environment, the algorithm outperforms a baseline method that mimics current industry approaches to portfolio pursuit.

Abstract

When faced with a new customer, many factors contribute to an insurance firm's decision of what offer to make to that customer. In addition to the expected cost of providing the insurance, the firm must consider the other offers likely to be made to the customer, and how sensitive the customer is to differences in price. Moreover, firms often target a specific portfolio of customers that could depend on, e.g., age, location, and occupation. Given such a target portfolio, firms may choose to modulate an individual customer's offer based on whether the firm desires the customer within their portfolio. We term the problem of modulating offers to achieve a desired target portfolio the portfolio pursuit problem. Having formulated the portfolio pursuit problem as a sequential decision making problem, we devise a novel reinforcement learning algorithm for its solution. We test our method on a complex synthetic market environment, and demonstrate that it outperforms a baseline method which mimics current industry approaches to portfolio pursuit.

Paper Structure (14 sections, 21 equations, 1 figure, 1 table, 1 algorithm)

Introduction
Problem formulation
Portfolio pursuit as a Markov Decision Process
Specification of portfolio loss function
Methods
Standard industry methodology
Baseline methods for portfolio pursuit
A Reinforcement Learning approach to Portfolio Pursuit
Relationship to existing RL methods
Additional simulation details
Results
Discussion
Bellman recursions for portfolio values
Generalisation to customers leaving

Figures (1)

Figure 1: A side-by-side comparison of our method with an industry baseline. We plot the mean (with standard-deviation error bars) of three quantities of interest over the testing epoch: (A-B) the profit $\sum_{k=1}^t y_k \mathrm{Profit}(s_k,a_k)$, (C-D) the loss between the current and target portfolio $\lambda \mathcal{L}(\rho_t, \rho^*)$, and (E-F) the difference between them. Left-hand panels (A, C, E) correspond to our method (see "A Reinforcement Learning approach to Portfolio Pursuit"), and right-hand panels (B, D, F) correspond to the baseline method (see "Baseline methods for portfolio pursuit"). The black dashed line in each panel marks the mean value at the terminal time-step. See "Additional simulation details" for further information about our experimental set-up.
