Approach to Finding a Robust Deep Learning Model
Alexey Boldyrev, Fedor Ratnikov, Andrey Shevelev
TL;DR
This work addresses the reliability of deep learning predictions by proposing a framework for detecting robustness and a meta-algorithm for automated model selection. The authors apply this approach to small CNNs used for energy and position reconstruction in calorimeter simulations, systematically varying training sample size, weight initialization, and inductive bias. Their results show that a carefully chosen model selection process can identify robust models faster and with fewer instances than an exhaustive search, and that inductive bias reduces the amount of training data required for robust performance. The study has practical implications for AutoML and fault-tolerant ML applications, illustrating how robustness, rather than peak accuracy alone, can guide model choice in complex, distribution-shifting environments.
Abstract
The rapid development of machine learning (ML) and artificial intelligence (AI) applications requires the training of large numbers of models. This growing demand highlights the importance of training models without human supervision while ensuring that their predictions are reliable. In response to this need, we propose a novel approach for determining model robustness. This approach, supplemented with a proposed model selection algorithm designed as a meta-algorithm, is versatile and applicable to any machine learning model, provided that the model is appropriate for the task at hand. This study demonstrates the application of our approach to evaluating the robustness of deep learning models. To this end, we study small models composed of a few convolutional and fully connected layers and trained with common optimizers; we choose such models for their ease of interpretation and computational efficiency. Within this framework, we address the influence of training sample size, model weight initialization, and inductive bias on the robustness of deep learning models.
