A Stochastic Approach to Classification Error Estimates in Convolutional Neural Networks
Jan Peleska, Felix Brüning, Mario Gleirscher, Wen-ling Huang
TL;DR
The work presents a grey-box framework for certifying CNN-based obstacle detection (OD) in autonomous freight trains with Grade of Automation (GoA) 4 by coupling system-level risk analysis with a mathematical CNN model. It introduces classification clusters and equivalence classes to enable targeted residual-error estimation and develops two statistical strategies, a model-agnostic Monte Carlo method and a white-box, equivalence-class-based method, paired with parametric stochastic model checking. The findings show that a 3oo3 fusion of independent 2oo2 OD modules can meet a tolerable hazard rate of $10^{-7}/h$ under ANSI/UL 4600, illustrating practical certifiability for freight and metro applications while highlighting challenges for high-speed rail. The approach advances trustworthy AI integration in railway safety by providing structured, quantitative confidence in CNN-based perception within certification frameworks. Future work targets real-world data, color imagery, and expanded sensor fusion to broaden applicability and robustness.
Abstract
This technical report presents research results on the verification of trained Convolutional Neural Networks (CNNs) used for image classification in safety-critical applications. As a running example, we use the obstacle detection function needed in future autonomous freight trains with Grade of Automation (GoA) 4. It is shown that systems like GoA 4 freight trains are indeed certifiable today when new standards like ANSI/UL 4600 and ISO 21448 are used in addition to the long-existing standards EN 50128 and EN 50129. Moreover, we present a quantitative analysis of the system-level hazard rate to be expected from an obstacle detection function. It is shown that, using sensor/perceptor fusion, the fused detection system can meet the tolerable hazard rate deemed acceptable for the safety integrity level to be applied (SIL-3). A mathematical analysis of CNN models is performed, which results in the identification of classification clusters and equivalence classes partitioning the image input space of the CNN. These clusters and classes are used to introduce a novel statistical testing method for determining the residual error probability of a trained CNN and an associated upper confidence limit. We argue that this grey-box approach to CNN verification, which takes the CNN model's internal structure into account, is essential for justifying that the statistical tests have covered the trained CNN, with its neurons and inter-layer mappings, in a comprehensive way.
