Heterogeneous Graph Pre-training Based Model for Secure and Efficient Prediction of Default Risk Propagation among Bond Issuers

Xurui Li, Xin Shan, Wenhao Yin, Haijiao Wang

TL;DR

This work tackles default risk propagation among bond issuers by addressing the data scarcity and privacy concerns that limit end-to-end GNN performance. It introduces a two-stage framework: a heterogeneous graph masked autoencoder (HGMAE) is pre-trained on a comprehensive Enterprise Knowledge Graph (EKG) with multiple edge types, and the resulting embeddings are then fused with bond-issuer task features to predict propagation probabilities. The HGMAE applies masking to both the original graph and isomorphic subgraphs and is trained with a scaled cosine reconstruction loss, so that it captures feature diffusion in a heterogeneous setting. Empirical results on a real China bond market dataset show that the two-stage HGMAE approach with an XGBoost classifier outperforms the baselines, underscoring the value of secure pre-training and edge-type-aware representation learning for financial risk assessment.
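The scaled cosine reconstruction loss mentioned above is not spelled out in this summary. The following is a minimal sketch, assuming the form commonly used in masked graph autoencoders such as GraphMAE: the mean over masked nodes of (1 - cosine similarity) raised to a power gamma >= 1. The function name, the default `gamma=2.0`, and the list-of-lists input format are illustrative assumptions, not the authors' implementation.

```python
import math


def scaled_cosine_error(original, reconstructed, gamma=2.0, eps=1e-12):
    """Mean of (1 - cos(x, z)) ** gamma over the masked nodes.

    original, reconstructed: lists of equal-length feature vectors,
    one pair per masked node (assumed input format).
    gamma: scaling exponent; larger values down-weight pairs that are
    already well reconstructed (the default is an assumption).
    """
    if not original or len(original) != len(reconstructed):
        raise ValueError("need the same non-zero number of vectors on both sides")

    total = 0.0
    for x, z in zip(original, reconstructed):
        dot = sum(a * b for a, b in zip(x, z))
        norm_x = math.sqrt(sum(a * a for a in x))
        norm_z = math.sqrt(sum(b * b for b in z))
        cos = dot / max(norm_x * norm_z, eps)
        # Clamp against floating-point drift so the base stays in [0, 2].
        cos = max(-1.0, min(1.0, cos))
        total += (1.0 - cos) ** gamma
    return total / len(original)


# Toy check: a perfect reconstruction costs nothing, an orthogonal one costs 1.
perfect = scaled_cosine_error([[1.0, 0.0]], [[1.0, 0.0]])
orthogonal = scaled_cosine_error([[1.0, 0.0]], [[0.0, 1.0]])
```

Because the loss depends only on the angle between vectors, it is insensitive to feature magnitude, which is one reason this family of losses is often preferred over mean squared error for feature reconstruction.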

Abstract

Efficient prediction of default risk for bond-issuing enterprises is pivotal for maintaining stability and fostering growth in the bond market. Conventional methods usually rely solely on an enterprise's internal data for risk assessment. In contrast, graph-based techniques leverage interconnected corporate information to enhance default risk identification for targeted bond issuers. Traditional graph techniques such as the label propagation algorithm or DeepWalk fail to effectively integrate an enterprise's inherent attribute information with its topological network data. Additionally, due to data scarcity and security and privacy concerns between enterprises, end-to-end graph neural network (GNN) algorithms may struggle to deliver satisfactory performance on target tasks. To address these challenges, we present a novel two-stage model. In the first stage, we employ an innovative Masked Autoencoder for Heterogeneous Graphs (HGMAE) to pre-train on a vast enterprise knowledge graph. In the second stage, a specialized classifier is trained to predict default risk propagation probabilities. The classifier takes as input the feature vectors produced by the pre-trained encoder, concatenated with the enterprise's task-specific feature vectors. Through this two-stage training approach, our model not only boosts the importance of unique bond characteristics for specific default prediction tasks, but also securely and efficiently leverages the global information pre-trained from other enterprises. Experimental results demonstrate that our proposed model outperforms existing approaches in predicting default risk for bond issuers.
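The second-stage input described above, encoder embeddings concatenated with task-specific features, can be sketched as follows. This is an illustrative assumption about the data layout, not the authors' code: the function name `fuse_features`, the issuer identifiers, and the toy values are all hypothetical. Only the frozen embeddings from stage one are consumed here, so the raw knowledge-graph data never has to reach the downstream task.

```python
from typing import Dict, List, Sequence


def fuse_features(
    issuer_ids: Sequence[str],
    pretrained_embeddings: Dict[str, List[float]],
    task_features: Dict[str, List[float]],
) -> List[List[float]]:
    """Concatenate each issuer's stage-one embedding with its task features.

    Returns one row per issuer, in the order given by issuer_ids.
    """
    fused = []
    for issuer in issuer_ids:
        if issuer not in pretrained_embeddings or issuer not in task_features:
            raise KeyError(f"missing features for issuer {issuer!r}")
        # Embedding first, task-specific bond features second.
        fused.append(list(pretrained_embeddings[issuer]) + list(task_features[issuer]))
    return fused


# Hypothetical toy data: 2-dimensional embeddings and 3 task features per issuer.
embeddings = {"issuer_a": [0.1, -0.3], "issuer_b": [0.7, 0.2]}
task = {"issuer_a": [1.0, 0.0, 5.5], "issuer_b": [0.0, 1.0, 2.0]}

X = fuse_features(["issuer_a", "issuer_b"], embeddings, task)
```

The rows of `X` would then be passed, together with propagation labels, to the second-stage classifier; the TL;DR names XGBoost as the classifier used in the reported experiments.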

Paper Structure (5 sections, 2 equations, 1 figure, 1 table)

This paper contains 5 sections, 2 equations, 1 figure, 1 table.

Figures (1)

  • Figure 1: The main framework of our two-stage risk propagation prediction method. The right part shows the details of the proposed HGMAE model. Different colors in the HGMAE diagram indicate different edge types.