Optimizing Credit Limit Adjustments Under Adversarial Goals Using Reinforcement Learning

Sherly Alfonso-Sánchez, Jesús Solano, Alejandro Correa-Bahnsen, Kristina P. Sendova, Cristián Bravo

TL;DR

The paper tackles the problem of optimizing credit limit adjustments for revolving credit under adversarial goals by formulating it as an offline reinforcement learning task. It defines two actions $a\in\{0,1\}$ (increase or maintain) and builds an offline environment using a Latin American super-app dataset, employing a two-stage balance predictor and provisioning model that relies on $PD$, $LGD$, $EAD$, and $CCF$ to compute rewards. The study finds that a Double Q-learning agent trained on this offline simulator outperforms various baselines in synthetic trials, while alternative app data offer limited predictive gains; the $CCF$-based provisioning term is crucial for learning. The work demonstrates a practical, data-driven RL framework for credit limit management that banks and fintechs can adapt, including a deployment workflow for quarterly decisions and avenues for extending the action space and counterfactual analyses with human oversight.
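The tension between revenue and provisions can be illustrated with a small sketch. It assumes the standard expected-loss form (provision = $PD \times LGD \times EAD$, with $EAD$ equal to the drawn balance plus $CCF$ times the undrawn limit); the paper's exact reward definition is not given in this summary, so the function names, the revenue argument, and the numbers below are hypothetical.

```python
def exposure_at_default(balance: float, limit: float, ccf: float) -> float:
    """EAD under the assumed standard form: drawn balance + CCF * undrawn limit."""
    undrawn = max(limit - balance, 0.0)
    return balance + ccf * undrawn


def provision(pd: float, lgd: float, balance: float, limit: float, ccf: float) -> float:
    """Expected-loss provision, assumed here to be PD * LGD * EAD."""
    return pd * lgd * exposure_at_default(balance, limit, ccf)


def reward(revenue: float, pd: float, lgd: float,
           balance: float, limit: float, ccf: float) -> float:
    """Illustrative profit signal: revenue minus provision (an assumption, not the paper's formula)."""
    return revenue - provision(pd, lgd, balance, limit, ccf)


if __name__ == "__main__":
    # Hypothetical customer: same balance and risk, two candidate limits.
    keep = provision(pd=0.25, lgd=0.5, balance=400.0, limit=1000.0, ccf=0.5)
    raise_ = provision(pd=0.25, lgd=0.5, balance=400.0, limit=1500.0, ccf=0.5)
    print(keep, raise_)
```

Under these assumptions, raising the limit increases the provision even when the balance does not move, because the undrawn portion enters $EAD$ through $CCF$. An increase therefore only pays off if the extra revenue it generates exceeds the extra provision.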

Abstract

Reinforcement learning has been explored for many problems, from video games with deterministic environments to portfolio and operations management in which scenarios are stochastic; however, there have been few attempts to test these methods in banking problems. In this study, we sought to find and automate an optimal credit card limit adjustment policy by employing reinforcement learning techniques. Because of the historical data available, we considered two possible actions per customer, namely increasing or maintaining an individual's current credit limit. To find this policy, we first formulated this decision-making question as an optimization problem in which the expected profit was maximized; therefore, we balanced two adversarial goals: maximizing the portfolio's revenue and minimizing the portfolio's provisions. Second, given the particularities of our problem, we used an offline learning strategy to simulate the impact of the action based on historical data from a super-app in Latin America to train our reinforcement learning agent. Our results, based on the proposed methodology involving synthetic experimentation, show that a Double Q-learning agent with optimized hyperparameters can outperform other strategies and generate a non-trivial optimal policy that not only reflects the complex nature of this decision but also offers an incentive to explore reinforcement learning in real-world banking scenarios. Our research establishes a conceptual structure for applying a reinforcement learning framework to credit limit adjustment, presenting an objective technique to make these decisions primarily based on data-driven methods rather than relying only on expert-driven systems. We also study the use of alternative data for the problem of balance prediction, as the latter is a requirement of our proposed model. We find the use of such data does not always bring prediction gains.
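For readers unfamiliar with Double Q-learning, the following is a minimal tabular sketch of its update rule with the two actions described above. The state encoding, the action numbering (0 = maintain, 1 = increase), the hyperparameter values, and the function name are assumptions for illustration; the summary does not say whether the authors use a tabular or function-approximation agent.

```python
import random

MAINTAIN, INCREASE = 0, 1  # assumed action encoding


def double_q_update(qa, qb, s, a, r, s_next, alpha=0.1, gamma=0.9, rng=random):
    """One Double Q-learning step on two tables indexed as table[state][action].

    With probability 0.5 one table is updated; that table selects the greedy
    next action and the other table evaluates it, which reduces the
    overestimation bias of plain Q-learning.
    """
    if rng.random() < 0.5:
        update, evaluate = qa, qb
    else:
        update, evaluate = qb, qa
    # Greedy next action according to the table being updated.
    best = max(range(len(update[s_next])), key=lambda i: update[s_next][i])
    # Value of that action according to the other table.
    target = r + gamma * evaluate[s_next][best]
    update[s][a] += alpha * (target - update[s][a])


if __name__ == "__main__":
    n_states = 3  # hypothetical discretized customer states
    qa = [[0.0, 0.0] for _ in range(n_states)]
    qb = [[0.0, 0.0] for _ in range(n_states)]
    # One logged transition (state, action, reward, next state); values are made up.
    double_q_update(qa, qb, s=0, a=INCREASE, r=1.0, s_next=1)
    print(qa, qb)
```

In an offline setting such as the one described, the transitions would come from the simulator built on historical data rather than from live interaction with customers.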

Paper Structure (20 sections, 9 equations, 8 figures, 4 tables)

Figures (8)

  • Figure 1: Global variable importance for the classifier.
  • Figure 2: Global variable importance for regressors.
  • Figure 3: Learning process.
  • Figure 4: Comparison of different strategies for adjusting credit limits.
  • Figure 5: Histograms of monthly average utilization, payment, spending consumption rate, and current limit.
  • ...and 3 more figures