Adaptive Sentencing Prediction with Guaranteed Accuracy and Legal Interpretability
Yifei Jin, Xin Zheng, Lei Guo
TL;DR
This paper tackles interpretable and accurate sentencing prediction under nonstationary data by introducing a Saturated Mechanistic Sentencing (SMS) model anchored in Chinese Criminal Law and an online Momentum LMS (MLMS) algorithm. It establishes theoretical bounds for adaptive prediction accuracy that do not rely on stationarity or independence, including a best-possible upper bound when parameters are known. The authors construct the Chinese Intentional Bodily Harm (CIBH) dataset with 82 feature factors and demonstrate that SMS-MLMS achieves high predictive accuracy, approaching the theoretical upper bounds and outperforming strong baselines. The work has practical implications for transparent judicial decision-making and offers a framework adaptable to other crimes and domains, supported by both Lyapunov/martingale analysis and real-world data validation.
Abstract
Existing research on judicial sentencing prediction predominantly relies on end-to-end models, which often neglect the inherent sentencing logic and lack interpretability, a critical requirement for both scholarly research and judicial practice. To address this challenge, we make three key contributions. First, we propose a novel Saturated Mechanistic Sentencing (SMS) model, which is legally interpretable by construction because it is grounded in China's Criminal Law, and we introduce a corresponding Momentum Least Mean Squares (MLMS) adaptive algorithm for this model. Second, for the adaptive sentencing predictor based on the MLMS algorithm, we establish a mathematical theory of adaptive prediction accuracy that does not rely on any stationarity or independence assumptions on the data. We also provide a best possible upper bound on the prediction accuracy, namely the accuracy achievable by the best predictor when the model parameters are known. Third, we construct a Chinese Intentional Bodily Harm (CIBH) dataset. Extensive experiments on this real-world data demonstrate that our approach achieves a prediction accuracy close to the best possible theoretical upper bound, validating both the model's suitability and the algorithm's accuracy.
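For readers unfamiliar with momentum-type LMS updates, the following is a minimal sketch of a generic momentum LMS recursion applied to a saturated linear predictor. It is not the SMS model or the MLMS algorithm of this paper, whose exact forms are not given in the abstract. The clipping bounds `lower` and `upper`, the normalized step, and the names `momentum_lms`, `step`, and `momentum` are all illustrative assumptions.

```python
import numpy as np


def saturate(value, lower, upper):
    """Clip a scalar prediction to the interval [lower, upper]."""
    return min(max(value, lower), upper)


def momentum_lms(features, targets, lower, upper, step=0.5, momentum=0.3):
    """Online one-step-ahead prediction with a generic momentum LMS update.

    Assumption (not from the paper): the predictor is a linear function of
    the features, clipped to [lower, upper], and the gradient step is
    normalized by 1 + ||phi||^2.
    """
    n_samples, n_features = features.shape
    theta = np.zeros(n_features)       # current parameter estimate
    prev_theta = np.zeros(n_features)  # previous estimate, for the momentum term
    predictions = np.empty(n_samples)

    for k in range(n_samples):
        phi = features[k]
        # Predict before the true target is revealed.
        predictions[k] = saturate(float(phi @ theta), lower, upper)
        error = targets[k] - predictions[k]
        # Normalized LMS correction plus a momentum term.
        new_theta = (
            theta
            + step * error * phi / (1.0 + phi @ phi)
            + momentum * (theta - prev_theta)
        )
        prev_theta, theta = theta, new_theta

    return theta, predictions
```

Each prediction is made before the corresponding target is used in the update, so the returned `predictions` array can be compared directly with `targets` to assess online prediction accuracy.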
