Multi-Task Dynamic Pricing in Credit Market with Contextual Information

Adel Javanmard, Jingwei Ji, Renyuan Xu

TL;DR

This paper develops a dynamic pricing framework for a large set of illiquid credit-market securities by exploiting latent structure across securities. It introduces the Two-Stage Multi-Task (TSMT) pricing policy, which first pools data to learn a common parameter and then refines per-security parameters with regularization, enabling automatic adaptation to inter-security similarity without prior knowledge of the similarity level. The authors prove a sublinear regret bound of order $\tilde{O}(\delta_{\max}\sqrt{TMd} + Md)$, where $T$ is the time horizon, $M$ the number of securities, $d$ the feature dimension, and $\delta_{\max}$ the heterogeneity across securities. They validate the approach with both synthetic simulations and real corporate-bond data, showing improvements over fully pooled or fully individual benchmarks. The work bridges online dynamic pricing, multi-task learning, and credit-market pricing under censored feedback, offering a scalable, data-efficient method for real-time quote generation in OTC markets. Its practical impact lies in more accurate, competitive pricing across a broad security universe in settings with limited data and regulatory information constraints.
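
As a rough sense of scale for the two terms in this bound, the calculation below plugs in the parameters of the Figure 3 example ($d=30$, $M=20$, $T=2048$, $\delta_{\max}=0.3$). Constants and logarithmic factors hidden by $\tilde{O}$ are ignored, so this only illustrates relative magnitudes and is not a regret prediction.

$$\delta_{\max}\sqrt{TMd} = 0.3\sqrt{2048 \cdot 20 \cdot 30} = 0.3\sqrt{1{,}228{,}800} \approx 0.3 \times 1108.5 \approx 332.6, \qquad Md = 20 \cdot 30 = 600.$$

With these values the additive $Md$ term is the larger one. The heterogeneity term grows like $\sqrt{T}$ and overtakes it once $T > Md/\delta_{\max}^2 = 600/0.09 \approx 6667$.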

Abstract

We study the dynamic pricing problem faced by a broker seeking to learn prices for a large number of credit market securities, such as corporate bonds, government bonds, loans, and other credit-related securities. A major challenge in pricing these securities stems from their infrequent trading and the lack of transparency in over-the-counter (OTC) markets, which leads to insufficient data for individual pricing. Nevertheless, many securities share structural similarities that can be exploited. Moreover, brokers often place small "probing" orders to infer competitors' pricing behavior. Leveraging these insights, we propose a multi-task dynamic pricing framework that leverages the shared structure across securities to enhance pricing accuracy. In the OTC market, a broker wins a quote by offering a more competitive price than rivals. The broker's goal is to learn winning prices while minimizing expected regret against a clairvoyant benchmark. We model each security using a $d$-dimensional feature vector and assume a linear contextual model for the competitor's pricing of the yield, with parameters unknown a priori. We propose the Two-Stage Multi-Task (TSMT) algorithm: first, an unregularized MLE over pooled data to obtain a coarse parameter estimate; second, a regularized MLE on individual securities to refine the parameters. We show that the TSMT achieves a regret bounded by $\tilde{O}(\delta_{\max} \sqrt{TMd} + Md)$, outperforming both fully individual and fully pooled baselines, where $M$ is the number of securities and $\delta_{\max}$ quantifies their heterogeneity.
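
The two estimation stages described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation. Several pieces are assumptions made for the sketch: the win rule `w = 1{p >= y}` with standard normal noise (a probit likelihood), the squared-$\ell_2$ penalty toward the pooled estimate, the penalty weight `lam`, the uniformly random exploratory quotes, and all function and variable names. The episodic schedule and the pricing rule of the actual policy are not reproduced.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)  # elementwise error function using only the stdlib

def norm_cdf(z):
    return 0.5 * (1.0 + _erf(z / math.sqrt(2.0)))

def norm_pdf(z):
    return np.exp(-0.5 * z ** 2) / math.sqrt(2.0 * math.pi)

def nll_grad(theta, X, p, w):
    """Gradient of the average probit negative log-likelihood.

    Assumed model: w = 1{p >= x.theta + eps}, eps ~ N(0, 1),
    so P(w = 1) = Phi(p - x.theta).
    """
    z = p - X @ theta
    Phi = np.clip(norm_cdf(z), 1e-9, 1.0 - 1e-9)
    score = (w - Phi) * norm_pdf(z) / (Phi * (1.0 - Phi))
    return X.T @ score / len(w)

def fit(X, p, w, center=None, lam=0.0, steps=500):
    """Gradient descent on the probit loss, optionally penalized by
    lam * ||theta - center||^2 (assumed penalty form)."""
    n, d = X.shape
    theta = np.zeros(d) if center is None else center.copy()
    # Step size 1/L: the probit loss has curvature at most 1 in z.
    L = np.linalg.eigvalsh(X.T @ X / n)[-1] + 2.0 * lam
    for _ in range(steps):
        g = nll_grad(theta, X, p, w)
        if center is not None:
            g = g + 2.0 * lam * (theta - center)
        theta = theta - g / L
    return theta

# Synthetic data: M securities whose parameters lie within delta of a common vector.
rng = np.random.default_rng(0)
d, M, n_per, delta = 3, 10, 40, 0.05
theta_common = np.array([1.0, -0.5, 0.5])
thetas = theta_common + delta * rng.uniform(-1, 1, size=(M, d)) / np.sqrt(d)

data = []
for j in range(M):
    X = rng.normal(size=(n_per, d))
    p = rng.uniform(-2.0, 2.0, size=n_per)       # exploratory quotes (assumption)
    y = X @ thetas[j] + rng.normal(size=n_per)   # competitor level, never observed
    w = (p >= y).astype(float)                   # censored feedback: win or lose
    data.append((X, p, w))

# Stage 1: unregularized MLE over the pooled data.
X_all = np.vstack([X for X, _, _ in data])
p_all = np.concatenate([p for _, p, _ in data])
w_all = np.concatenate([w for _, _, w in data])
theta_pool = fit(X_all, p_all, w_all)

# Stage 2: per-security MLE regularized toward the pooled estimate.
theta_multi = np.array([fit(X, p, w, center=theta_pool, lam=0.5) for X, p, w in data])
# Baseline: per-security MLE without pooling.
theta_indiv = np.array([fit(X, p, w) for X, p, w in data])

err_pool = np.linalg.norm(theta_pool - theta_common)
print("pooled error:", err_pool)
print("mean two-stage error:", np.linalg.norm(theta_multi - thetas, axis=1).mean())
print("mean individual error:", np.linalg.norm(theta_indiv - thetas, axis=1).mean())
```

The penalty weight controls how far each security's estimate may move away from the pooled one: a very large `lam` reproduces pooling, while `lam = 0` starting from scratch reproduces individual learning. Choosing it without knowing $\delta_{\max}$ is the part the paper's analysis addresses and this sketch does not.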

Paper Structure

This paper contains 45 sections, 15 theorems, 109 equations, 10 figures, 2 tables, 1 algorithm.

Key Result

Theorem 1

Under the paper's assumptions (diverse contexts through normal noise), Algorithm 1 (the two-stage TSMT policy) ensures a regret bound, summarized in the abstract as $\tilde{O}(\delta_{\max}\sqrt{TMd} + Md)$.

Figures (10)

  • Figure 1: Three outstanding bonds (as of October 2024) issued by Apple and the Vanguard Short-Term Bond Index ETF. Apple bonds track each other and respond to the ETF in a similar fashion.
  • Figure 2: Algorithm 1 adaptively matches the performance of the pooling strategy and the individual learning strategy, without knowing $\delta_{\max}$.
  • Figure 3: An illustration of the estimator trajectory. In this example, we set $d=30, M=20, T=2048, \delta_{\max}=0.3$ and a uniform arrival distribution $\boldsymbol{\pi}$. We visualize the trajectory by projecting the coefficients to 2 (out of 30) dimensions. The coefficient $\boldsymbol{\theta}_\star^j$ of bond $j$ is shown by the arrow. Multi-task estimators $\hat{\boldsymbol{\theta}}_{(k)}^{j}$ in different episodes (denoted by blue crosses) are connected by blue dashed lines. Likewise, individual learning estimators are in light coral.
  • Figure 4: The estimation error of multi-task learning and individual learning for the example in Figure 3. The dashed lines indicate the estimation error of individual learning and multi-task learning, respectively.
  • Figure 5: Regrets across diverse problem configurations under uniform arrivals, compared against two benchmark policies: individual learning and pooling. The solid curves depict regrets averaged over 50 random instances, while the shaded areas denote the associated plus/minus one standard deviation ranges. Our observations consistently show that multi-task learning outperforms or performs comparably to the other two strategies. When multi-task learning is not the best of the three, it tends to be close to the best one.
  • ...and 5 more figures

Theorems & Definitions (21)

  • Remark 1: Formulation and information structure
  • Remark 2: Linear form of $y_t$
  • Remark 3: Pricing vs inventory control
  • Remark 4: Exogeneity of competitors' quotes
  • Theorem 1
  • Remark 5: The additive $\mathcal{O}(Md)$ term
  • Corollary 1
  • Remark 6: Connections with literature
  • Lemma 1: Stage I Estimation Error
  • Lemma 2: Stage I Expectation Bound
  • ...and 11 more