Towards a Framework for Deep Learning Certification in Safety-Critical Applications Using Inherently Safe Design and Run-Time Error Detection
Romeo Valentin
TL;DR
This work addresses the certification gap for deep learning in safety-critical applications by proposing a framework that combines inherently safe design with run-time error detection. Grounded in a runway pose-estimation use case, it argues for recovering disentangled semantic variables through weakly supervised representation learning, enabling robust regression, uncertainty quantification, and out-of-distribution (OOD) detection without relying on regression labels. It surveys industry regulation, proposes a taxonomy of safety methodologies, and develops a concrete model structure that uses weak supervision to learn content representations, paired with a simple linear head and ensemble-based OOD detection. The approach aims to provide verifiable guarantees by enforcing disentanglement, structured priors, and principled run-time checks, offering a pathway toward certifiable deep learning in domains such as aviation and autonomous systems. The work also discusses the practical limits of current adversarial defenses and highlights the need for ongoing collaboration between regulatory bodies and ML researchers to advance safe deployment of AI in critical infrastructure.
Abstract
Although an ever-growing number of applications employ deep-learning-based systems for prediction, decision-making, or state estimation, almost no certification processes have been established that would allow such systems to be deployed in safety-critical applications. In this work we consider real-world problems arising in aviation and other safety-critical areas, and investigate the requirements they place on a certified model. To this end, we review methodologies from the machine learning research community aimed at verifying the robustness and reliability of deep learning systems, and evaluate their applicability to real-world problems. We then establish a new framework for deep learning certification based on (i) inherently safe design and (ii) run-time error detection. Using a concrete use case from aviation, we show how deep learning models can recover disentangled variables through weakly supervised representation learning. We argue that such a system design is inherently less prone to common model failures, and can be verified to encode the underlying mechanisms governing the data. Next, we investigate four topics related to the run-time safety of a model, namely (i) uncertainty quantification, (ii) out-of-distribution detection, (iii) feature collapse, and (iv) adversarial attacks. We evaluate each for its applicability and formulate a set of desiderata that a certified model should fulfill. Finally, we propose a novel model structure that exhibits all desired properties discussed in this work, and is able to make regression and uncertainty predictions, as well as detect out-of-distribution inputs, while requiring no regression labels to train. We conclude with a discussion of the current state and expected future progress of deep learning certification, and its industrial and social implications.
