Reinforcement Learning in Credit Scoring and Underwriting

Seksan Kiatsupaibul, Pakawan Chansiripas, Pojtanut Manopanjasiri, Kantapong Visantavarakul, Zheng Wen

TL;DR

To address ungeneralizable contexts in credit scoring, the paper reframes credit underwriting as a reinforcement learning (RL) bandit problem with action-space renewal and multiple-choice actions. It compares three agents built on a logistic regression backbone with Bayesian updating: GRE (greedy, which corresponds to the traditional underwriting approach), THS (Thompson sampling), and IDS (information-directed sampling). THS and IDS balance exploration against exploitation and outperform the GRE baseline when the data align with the model. Experiments use logistic-regression and neural-network data-generating processes under two scenarios: SF (single underwriting, fixed applicant pool) and MR (multiple underwriting, renewed applicant pool). IDS often yields the best performance and diversification, but when the data come from the neural-network process, model misalignment can erode these gains and sometimes favors GRE. The authors conclude that powerful learning models are essential for RL-based underwriting and point to neural-network-informed exploration strategies as future work.

Abstract

This paper proposes a novel reinforcement learning (RL) framework for credit underwriting that tackles ungeneralizable contextual challenges. We adapt RL principles for credit scoring, incorporating action space renewal and multi-choice actions. Our work demonstrates that the traditional underwriting approach aligns with the RL greedy strategy. We introduce two new RL-based credit underwriting algorithms to enable more informed decision-making. Simulations show these new approaches outperform the traditional method in scenarios where the data aligns with the model. However, complex situations highlight model limitations, emphasizing the importance of powerful machine learning models for optimal performance. Future research directions include exploring more sophisticated models alongside efficient exploration mechanisms.
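
To make the contrast between the greedy strategy and an exploring strategy concrete, here is a minimal sketch in Python. It is not the paper's method: it assumes a single-context Beta-Bernoulli bandit over a few hypothetical applicant segments, whereas the paper uses a logistic regression model with Bayesian updating. The function name `run_agent`, the segment default rates, and the reward of 1 for a repaid loan are all illustrative assumptions, and IDS is omitted.

```python
import numpy as np

def run_agent(policy, default_rates, n_rounds=2000, seed=0):
    """Simulate one agent and return its cumulative expected regret per round.

    Assumption: each arm is a hypothetical applicant segment with a fixed
    default rate; the reward is 1 if the approved loan is repaid, else 0.
    """
    rng = np.random.default_rng(seed)
    repay_prob = 1.0 - np.asarray(default_rates, dtype=float)
    k = len(repay_prob)
    # Beta(1, 1) prior on each segment's repayment probability.
    repaid = np.ones(k)
    defaulted = np.ones(k)
    best = repay_prob.max()
    regret = np.zeros(n_rounds)
    for t in range(n_rounds):
        if policy == "greedy":
            # Exploit only: score each segment by its posterior mean.
            scores = repaid / (repaid + defaulted)
        elif policy == "thompson":
            # Explore via one posterior sample per segment.
            scores = rng.beta(repaid, defaulted)
        else:
            raise ValueError(f"unknown policy: {policy}")
        arm = int(np.argmax(scores))
        outcome = float(rng.random() < repay_prob[arm])
        # Bayesian update of the chosen segment's Beta posterior.
        repaid[arm] += outcome
        defaulted[arm] += 1.0 - outcome
        regret[t] = best - repay_prob[arm]
    return np.cumsum(regret)
```

Calling `run_agent("greedy", [0.10, 0.05, 0.20])` and `run_agent("thompson", [0.10, 0.05, 0.20])` and plotting the two returned arrays gives cumulative regret curves of the kind the paper reports in its figures. In this toy setting the greedy agent can lock onto a suboptimal segment early, which is the behavior the exploring agents are designed to avoid.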

Paper Structure

This paper contains 7 sections, 10 equations, 11 figures, 1 algorithm.

Figures (11)

  • Figure 1: Reinforcement learning components and their credit underwriting counterparts.
  • Figure 2: The agents' performance under two scenarios within the single-context environment. Panels (a) and (b) show the cumulative regret curves under the SF and MR scenarios, respectively.
  • Figure 3: The distributions of default rates by context, simulated under the environment derived from the logistic regression data generation model. Panel (a) shows the distributions as cumulative distribution functions. Panel (b) summarizes the distributions as box plots.
  • Figure 4: The agents' performance in the SF scenario (single underwriting, fixed applicant pool) within the logistic regression environment. Panels (a) and (b) show the regret and cumulative regret curves. Panels (c) and (d) show the reward and cumulative reward curves.
  • Figure 5: The agents' performance in the MR scenario (multiple underwriting, renewed applicant pool) within the logistic regression environment. Panels (a) and (b) show the regret and cumulative regret curves. Panels (c) and (d) show the reward and cumulative reward curves.
  • ...and 6 more figures