Reinforcement Learning Paycheck Optimization for Multivariate Financial Goals

Melda Alaluf, Giulia Crippa, Sinong Geng, Zijian Jing, Nikhil Krishnan, Sanjeev Kulkarni, Wyatt Navarro, Ronnie Sircar, Jonathan Tang

TL;DR

The paper tackles paycheck optimization by formulating it as a utility-maximization problem that unifies multiple heterogeneous financial goals and user preferences under stochastic rate dynamics. It adopts an end-to-end deep deterministic policy gradient method, with a policy $\pi(t)=f(X_t; \theta)$ and a value objective $V(\pi)=\sum_{t=0}^T\sum_{i\in I} u_i(X_t^{i,\pi})$, to learn income allocations without relying on parametric rate models. The contributions include a flexible state dynamics framework for debts, savings, and retirement, and a piecewise-linear utility structure that encodes urgency and priority across goals. Empirical results in both constant-rate and stochastic-rate settings demonstrate that the learned policies finish all goals while aligning with user preferences, and the work discusses explainability and future extensions such as portfolio optimization and offline RL.
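To make the value objective concrete, the sketch below evaluates $V(\pi)=\sum_{t=0}^T\sum_{i\in I} u_i(X_t^{i,\pi})$ for a hand-written allocation policy. It is only an illustration under stated assumptions, not the paper's implementation: the utility shape (a steep penalty below a target and a shallow reward above it), the balance update rule, and all names (`piecewise_linear_utility`, `evaluate_policy`, `equal_split`, `first_goal_only`) are hypothetical. No learning takes place here; a deep deterministic policy gradient method would instead train the parameters $\theta$ of $f(X_t;\theta)$ to maximize this quantity.

```python
def piecewise_linear_utility(x, target, slope_below=1.0, slope_above=0.1):
    """Assumed utility shape: steep penalty for a shortfall, shallow reward for a surplus."""
    gap = x - target
    return slope_below * gap if gap <= 0 else slope_above * gap


def evaluate_policy(policy, x0, targets, rates, income, horizon):
    """Return V(pi) = sum over t = 0..T and goals i of u_i(X_t^i).

    Assumed dynamics: each step, the paycheck is split by the policy's
    weights and every balance then grows at its own constant rate.
    """
    x = list(x0)
    total = 0.0
    for t in range(horizon + 1):
        total += sum(piecewise_linear_utility(xi, ti) for xi, ti in zip(x, targets))
        if t == horizon:
            break
        weights = policy(t, x)  # fractions of income per goal, summing to 1
        x = [(xi + income * w) * (1.0 + r) for xi, w, r in zip(x, weights, rates)]
    return total


def equal_split(t, x):
    """Hypothetical policy: divide the paycheck evenly across goals."""
    return [1.0 / len(x)] * len(x)


def first_goal_only(t, x):
    """Hypothetical policy: put the whole paycheck into the first goal."""
    return [1.0] + [0.0] * (len(x) - 1)


if __name__ == "__main__":
    # Two goals, zero rates, three time points (t = 0, 1, 2).
    settings = dict(x0=[0.0, 0.0], targets=[100.0, 100.0],
                    rates=[0.0, 0.0], income=100.0, horizon=2)
    print("equal split:    ", evaluate_policy(equal_split, **settings))
    print("first goal only:", evaluate_policy(first_goal_only, **settings))
```

In this toy setting the even split scores higher, because overfunding one goal earns only the shallow above-target slope while the other goal keeps accruing the steep shortfall penalty. Changing the slopes per goal is one way such a utility could express priority among goals.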

Abstract

We study paycheck optimization, which examines how to allocate income in order to achieve several competing financial goals. For paycheck optimization, a quantitative methodology is missing, due to a lack of a suitable problem formulation. To deal with this issue, we formulate the problem as a utility maximization problem. The proposed formulation is able to (i) unify different financial goals; (ii) incorporate user preferences regarding the goals; (iii) handle stochastic interest rates. The proposed formulation also facilitates an end-to-end reinforcement learning solution, which is implemented on a variety of problem settings.

Paper Structure (16 sections, 14 equations, 3 figures, 2 tables, 1 algorithm)

Figures (3)

  • Figure 1: Different utility functions
  • Figure 2: Contribution to each goal over time under the learned policy with constant rates for three different representative users: home buyer in blue, saver in orange, and debtor in green
  • Figure 3: Contribution to each goal over time under the learned policy with stochastic rates for three different representative users: home buyer in blue, saver in orange, and debtor in green. Note that the monthly income rises sharply after month $100$, since it is directly affected by inflation (which has climbed over the last couple of years).