A Generative Approach to Credit Prediction with Learnable Prompts for Multi-scale Temporal Representation Learning

Yu Lei, Zixuan Wang, Yiqing Feng, Junru Zhang, Yahui Li, Chu Liu, Tongyao Wang

TL;DR

FinLangNet tackles industrial credit risk by reframing creditworthiness as a multi-scale, evolving prediction task across future horizons. It integrates a non-sequential DeepFM module for static features with a sequential SRG module that uses a dual-prompt mechanism to capture both fine-grained and holistic temporal patterns, optimized by a hybrid loss with dynamic weighting. The method achieves a 7.2% KS improvement over XGBoost and a 9.9% relative reduction in bad debt on a real-world DiDi Finance dataset, and also attains state-of-the-art results on the UEA time-series benchmarks, demonstrating robust generalization. It is deployed in production with real-time inference and interpretable explanations via LIME, illustrating practical impact in large-scale credit decisioning and portfolio management. Future work includes cross-market transfer learning and applying the framework to other risk domains.

Abstract

Recent industrial credit scoring models remain heavily reliant on manually tuned statistical learning methods. Despite their potential, deep learning architectures have struggled to consistently outperform traditional statistical models in industrial credit scoring, largely due to the complexity of heterogeneous financial data and the challenge of modeling evolving creditworthiness. To bridge this gap, we introduce FinLangNet, a novel framework that reformulates credit scoring as a multi-scale sequential learning problem. FinLangNet processes heterogeneous financial data through a dual-module architecture that combines tabular feature extraction with temporal sequence modeling, generating probability distributions of users' future financial behaviors across multiple time horizons. A key innovation is our dual-prompt mechanism within the sequential module, which introduces learnable prompts operating at both feature-level granularity for capturing fine-grained temporal patterns and user-level granularity for aggregating holistic risk profiles. In extensive evaluations, FinLangNet significantly outperforms a production XGBoost system, achieving a 7.2% improvement in the KS metric and a 9.9% relative reduction in bad debt rate. Its effectiveness as a general-purpose sequential learning framework is further validated through state-of-the-art performance on the public UEA time series classification benchmark. The system has been successfully deployed on DiDi's international finance platform, serving leading financial credit companies in Latin America.
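
The KS metric cited above is the Kolmogorov-Smirnov statistic: the largest gap between the cumulative score distributions of defaulting and non-defaulting users. The following is a minimal sketch of how it is commonly computed, not the authors' evaluation code. The function name `ks_statistic` and the convention that label 1 marks a default are assumptions made for illustration.

```python
import numpy as np


def ks_statistic(y_true, y_score):
    """Largest gap between the empirical score CDFs of the two classes.

    Assumption (not from the paper): label 1 = default ("bad"),
    label 0 = non-default ("good").
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)

    bad = np.sort(y_score[y_true == 1])
    good = np.sort(y_score[y_true == 0])
    if bad.size == 0 or good.size == 0:
        raise ValueError("both classes must be present")

    # Evaluate both empirical CDFs at every distinct observed score.
    thresholds = np.unique(y_score)
    cdf_bad = np.searchsorted(bad, thresholds, side="right") / bad.size
    cdf_good = np.searchsorted(good, thresholds, side="right") / good.size

    return float(np.max(np.abs(cdf_bad - cdf_good)))


# Toy data, invented for illustration only.
labels = [0, 0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.30]
print(ks_statistic(labels, scores))
```

A KS of 0 means the two score distributions coincide, and a KS of 1 means some threshold separates the classes perfectly.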

Paper Structure

This paper contains 35 sections, 17 equations, 6 figures, and 8 tables.

Figures (6)

  • Figure 1: The dataset consists of two parts: Non-Sequential Data and Sequential Data. Non-Sequential Data includes static user information. Sequential Data represents time-dependent information. Together, they provide a comprehensive view of user characteristics and behaviors over time.
  • Figure 2: FinLangNet Framework Overview. The architecture incorporates two pivotal sub-modules to harness both sequential and non-sequential data effectively: the Sequential Module (SRG) for sequential features and the non-sequential DeepFM module for static features. These components are independently trained during the intermediary phase.
  • Figure 3: Model Performance on the UEA archive.
  • Figure 4: Overview of Deployment. (a) Online deployment and data flow. (b) Default‑rate matrix by score quantiles. (c) Risk–approval trade‑off.
  • Figure 5: Performance comparison at various risk thresholds for $y_1(\tau = 1)$ prediction. FinLangNet demonstrates superior precision at operational thresholds (0.2-0.4) commonly used in production.
  • ...and 1 more figure
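
The threshold comparison described in the Figure 5 caption can be illustrated with a small sketch. It is not the paper's evaluation code. It assumes that a user is flagged as risky when the predicted score is at or above the threshold and that label 1 marks a default. The helper name `precision_at_thresholds` and the toy data are hypothetical.

```python
import numpy as np


def precision_at_thresholds(y_true, y_score, thresholds):
    """Precision among users whose score is >= each threshold.

    Assumptions (not from the paper): label 1 = default, and a user
    is flagged when the score meets or exceeds the threshold.
    """
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score, dtype=float)

    result = {}
    for t in thresholds:
        flagged = y_score >= t
        if flagged.any():
            # Share of flagged users who actually defaulted.
            result[t] = float(y_true[flagged].mean())
        else:
            # Precision is undefined when nobody is flagged.
            result[t] = float("nan")
    return result


# Toy data, invented for illustration only.
labels = [0, 1, 1, 0, 1]
scores = [0.10, 0.25, 0.50, 0.35, 0.90]
print(precision_at_thresholds(labels, scores, [0.2, 0.3, 0.4]))
```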