Enhancing Deep Learning Model Robustness through Metamorphic Re-Training
Said Togru, Youssef Sameh Mostafa, Karim Lotfy
TL;DR
This study tackles the challenge of robust generalization in deep learning under limited labeled data by introducing a Metamorphic Retraining Framework that couples metamorphic testing with semi-supervised learning. The approach iteratively generates metamorphic-transformed data and retrains models with algorithms such as FixMatch, FlexMatch, MixMatch, and FullMatch. It evaluates robustness via metamorphic test success rates while tracking accuracy on held-out data. Key findings show that adaptive metamorphic retraining generally improves robustness and yields favorable accuracy-robustness trade-offs, with pretrained models offering additional gains; however, the benefits vary with data abundance and transformation type. The work demonstrates a scalable, parallelizable pipeline that can leverage unlabeled data and metamorphic relations to enhance model reliability in real-world scenarios where labeled data are scarce, with implications for deploying robust AI in safety-critical domains.
Abstract
This paper evaluates the use of metamorphic relations to enhance the robustness and real-world performance of machine learning models. We propose a Metamorphic Retraining Framework, which applies metamorphic relations to data and uses semi-supervised learning algorithms in an iterative and adaptive multi-cycle process. The framework integrates multiple semi-supervised retraining algorithms, including FixMatch, FlexMatch, MixMatch, and FullMatch, to automate the retraining, evaluation, and testing of models with specified configurations. To assess the effectiveness of this approach, we conducted experiments on the CIFAR-10, CIFAR-100, and MNIST datasets using a variety of image processing models, both pretrained and non-pretrained. Our results demonstrate that metamorphic retraining can significantly improve model robustness: on average, each model gained a flat (absolute) 17 percent in our robustness metric.
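To make the robustness metric concrete, the following is a minimal sketch of a metamorphic test success rate. It assumes the metric is the fraction of inputs whose predicted label is unchanged by a label-preserving transformation; the precise definition used in the paper is not given in this section. The names `metamorphic_success_rate`, `toy_predict`, `horizontal_flip`, and `brightness_shift` are hypothetical and are not part of the framework.

```python
import numpy as np

def metamorphic_success_rate(predict, inputs, transform):
    """Fraction of inputs whose predicted label is unchanged by `transform`.

    Assumed definition: a metamorphic test passes when the prediction on the
    transformed input equals the prediction on the original input.
    """
    original = predict(inputs)
    followup = predict(transform(inputs))
    return float(np.mean(original == followup))

# Hypothetical stand-in for a trained classifier: predicts 1 when the left
# half of an image is brighter than the right half, otherwise 0.
def toy_predict(images):
    half = images.shape[2] // 2
    left = images[:, :, :half].mean(axis=(1, 2))
    right = images[:, :, half:].mean(axis=(1, 2))
    return (left > right).astype(int)

# Two illustrative metamorphic transformations.
def horizontal_flip(images):
    return images[:, :, ::-1]

def brightness_shift(images, delta=0.1):
    return images + delta

# Deterministic toy batch: 8 images of size 4x4 whose pixel values increase
# from left to right, so the right half is always brighter.
images = np.arange(8 * 4 * 4, dtype=float).reshape(8, 4, 4)

flip_rate = metamorphic_success_rate(toy_predict, images, horizontal_flip)
shift_rate = metamorphic_success_rate(toy_predict, images, brightness_shift)
print(f"flip: {flip_rate:.2f}, brightness: {shift_rate:.2f}")
```

The toy classifier is deliberately sensitive to flips and insensitive to uniform brightness changes, so the two transformations give very different success rates. Under the same assumption, inputs that fail a metamorphic test are natural candidates to feed back into a retraining cycle.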
