Augmenting Bankruptcy Prediction using Reported Behavior of Corporate Restructuring
Xinlin Wang, Mats Brorsson
TL;DR
This work tackles SME bankruptcy prediction by augmenting traditional accounting-based ratios with reported corporate restructuring behavior to form a hybrid dataset. It conducts a comprehensive comparison of six models (LR, RF, LightGBM, MLP, CNN-1D, LSTM) across 1-, 2-, and 3-year windows, using Luxembourg Business Registers data and evaluating robustness with pre- and post-Covid periods. The study finds that hybrid data improves predictive performance by about 4-13% in AUC over single-source data and identifies LightGBM as a leading model, while also analyzing model drift during the Covid era. The results offer practical guidance for SME credit risk assessment and policy, demonstrating that incorporating restructuring behavior yields more holistic and robust bankruptcy risk predictions.
Abstract
Credit risk assessment of a company commonly relies on financial ratios derived from its financial statements. However, this approach may not capture other significant aspects of a company. We propose a hybrid dataset that combines financial statements with information about corporate restructuring behavior, and we use it to build a range of machine learning models for bankruptcy prediction. The hybrid dataset gives a more holistic view of a company's financial position and the dynamics of its business operations. The experiments use publicly available records of all files submitted by small and medium-sized enterprises to the Luxembourg Business Registers. We compare six machine learning models on bankruptcy prediction and validate the effectiveness of the hybrid dataset. In addition to the conventional test set, we deliberately use the years of the Covid-19 pandemic as an additional test set to evaluate the robustness of the models. The experimental results show that the hybrid dataset improves model performance by 4%-13% compared to a single-source dataset. We also identify suitable models for predicting bankruptcy.
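To make the idea of a hybrid dataset concrete, the following is a minimal sketch, not the authors' pipeline. It assumes that restructuring behavior is encoded as per-company counts of filing types and appended to the accounting ratios. The company identifiers, ratio names, event types, and the helper names `build_hybrid_rows` and `auc` are all hypothetical. The `auc` function uses the standard pairwise definition of the area under the ROC curve, the metric in which the reported improvement is expressed.

```python
from collections import Counter

# Hypothetical accounting ratios per company (names and values are illustrative only).
financial_ratios = {
    "A": {"current_ratio": 1.8, "debt_to_assets": 0.35},
    "B": {"current_ratio": 0.6, "debt_to_assets": 0.90},
}

# Hypothetical restructuring filings as (company_id, event_type) pairs.
restructuring_filings = [
    ("A", "address_change"),
    ("B", "manager_change"),
    ("B", "manager_change"),
    ("B", "address_change"),
]

# Assumed event vocabulary; the real registry categories are not given here.
EVENT_TYPES = ["address_change", "manager_change"]


def build_hybrid_rows(ratios, filings, event_types):
    """Join accounting ratios with per-company counts of restructuring events."""
    counts = Counter(filings)  # (company_id, event_type) -> number of filings
    rows = {}
    for company_id, ratio_features in ratios.items():
        row = dict(ratio_features)  # copy so the input is not mutated
        for event in event_types:
            # Counter returns 0 for companies with no filing of this type.
            row[f"n_{event}"] = counts[(company_id, event)]
        rows[company_id] = row
    return rows


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


hybrid_rows = build_hybrid_rows(financial_ratios, restructuring_filings, EVENT_TYPES)
```

Under these assumptions, each row of `hybrid_rows` holds both the ratio features and the event counts, so any of the compared models could be trained once on the ratio columns alone and once on the full row, with `auc` computed on the same held-out labels for both.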
