An Experimental Study of Semantic Continuity for Deep Learning Models
Shangxi Wu, Dongyuan Lu, Xian Zhao, Lizhang Chen, Jitao Sang
TL;DR
The paper addresses semantic discontinuity, where small, non-semantic perturbations of the input cause semantic-level changes in a deep model's output, undermining robustness and interpretability. It introduces a semantic continuity constraint, intended to yield smooth gradients, that minimizes the semantic-change metric $DS(x,x')$ between an original input $x$ and its non-semantically perturbed copy $x' = P(x)$. The original training loss is augmented to $Loss + \alpha Loss_{continuity}$, where $Loss_{continuity} = DS(x,x')$ and $\alpha$ weights the continuity term. Experiments on ImageNet and CIFAR-100 with ResNet variants show reduced semantic discontinuity (lower $DS$), improved adversarial robustness, clearer and more focused explanations (IG, GradCAM, LIME), better transferability, and reduced bias on Color MNIST. The work suggests that aligning models with semantic neighborhoods improves several trustworthiness metrics and offers a practical route to more human-aligned perception in deep learning systems.
Abstract
Deep learning models suffer from the problem of semantic discontinuity: small perturbations in the input space tend to cause semantic-level interference in the model output. We argue that semantic discontinuity results from inappropriate training targets and contributes to notorious problems in areas such as adversarial robustness and interpretability. We first conduct data analysis to provide evidence of semantic discontinuity in existing deep learning models, and then design a simple semantic continuity constraint that theoretically enables models to obtain smooth gradients and learn semantic-oriented features. Qualitative and quantitative experiments show that semantically continuous models reduce their use of non-semantic information, which in turn improves adversarial robustness, interpretability, model transfer, and machine bias.
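
As a rough illustration of how a continuity constraint of this kind can be attached to a training objective (a task loss plus a weighted term penalizing output change under a non-semantic perturbation), the following is a minimal pure-Python sketch. It is not the authors' implementation. The toy linear classifier, the brightness-shift perturbation `perturb`, and the choice of squared L2 distance between output distributions for `ds` are all assumptions made here for illustration; the paper's actual definitions of $P$ and $DS$ may differ.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(weights, x):
    """Toy linear classifier (assumption): one weight row per class."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)


def perturb(x, shift=0.05):
    """Hypothetical P(x): a uniform brightness shift, clipped to [0, 1]."""
    return [min(1.0, max(0.0, xi + shift)) for xi in x]


def ds(p, q):
    """Assumed DS: squared L2 distance between two output distributions."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def cross_entropy(p, label):
    """Task loss for a single example with an integer class label."""
    return -math.log(p[label] + 1e-12)


def total_loss(weights, x, label, alpha=1.0):
    """Return (task + alpha * continuity, task, continuity)."""
    p_x = predict(weights, x)
    p_xp = predict(weights, perturb(x))  # output on x' = P(x)
    task = cross_entropy(p_x, label)
    continuity = ds(p_x, p_xp)           # Loss_continuity = DS(x, x')
    return task + alpha * continuity, task, continuity


# Example: 3 classes, 4 input features with values in [0, 1].
example_weights = [
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, -1.0],
]
example_x = [0.2, 0.4, 0.6, 0.8]
loss, task, continuity = total_loss(example_weights, example_x, label=0, alpha=0.5)
print(f"task={task:.4f} continuity={continuity:.6f} total={loss:.4f}")
```

In this sketch, setting `alpha=0` recovers the plain task loss, while larger values penalize the model more strongly for changing its prediction when only brightness changes. In a real training loop the continuity term would be differentiated together with the task loss.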
