Towards Benchmarking Design Pattern Detection Under Obfuscation: Reproducing and Evaluating Attention-Based Detection Method
Manthan Shenoy, Andreas Rausch
TL;DR
This work asks whether attention-based design pattern detectors truly understand software semantics or rely on superficial syntactic cues. By reproducing DPDAtt and curating a 34-file Java subset covering 13 GoF patterns, with identifiers obfuscated and semantics preserved, the authors demonstrate that removing naming signals causes a dramatic drop in detection performance. The results reveal a strong dependence on token-level cues and offer a minimal, reproducible benchmark for assessing the semantic robustness of pattern detectors. The proposed Obfuscated Corpus serves as a practical probe of true semantic generalization in design pattern detection tools, guiding future development toward semantically aware methods.
Abstract
This paper investigates the semantic robustness of attention-based classifiers for design pattern detection, focusing on the extent to which they rely on structural and behavioral semantics rather than surface-level naming cues. We reproduce DPDAtt, an attention-based design pattern detection approach that uses learning-based classifiers, and evaluate its performance under obfuscation. To this end, we curate an obfuscated version of the DPDAtt Corpus in which identifiers (class names, method names, etc.), string literals (such as those in print statements), and comment blocks are replaced while control flow, inheritance, and logic are preserved. Our findings reveal that the trained classifiers in DPDAtt depend significantly on superficial syntactic features, leading to substantial misclassification when such cues are removed through obfuscation. This work highlights the need for more robust detection tools capable of capturing deeper semantics in source code. We propose our curated Obfuscated Corpus (containing 34 Java source files) as a reusable proof-of-concept benchmark for evaluating state-of-the-art design pattern detectors on their true semantic generalization capabilities.
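To make the kind of transformation described above concrete, the following is a minimal sketch of identifier, string-literal, and comment obfuscation applied to a Java snippet. It is an illustration only and not the procedure used to build the Obfuscated Corpus: the function name `obfuscate`, the `KEEP` set, the placeholder names `id0`, `id1`, ..., the `"s"` replacement literal, and the `SAMPLE` class are all assumptions introduced here. Comments are dropped rather than replaced, and the regex tokenizer ignores character literals, text blocks, and most JDK names, which a real tool would need to handle.

```python
import re

# Hypothetical keep-list: a subset of Java keywords plus a few JDK names
# that must survive renaming so the sample stays structurally intact.
KEEP = {
    "public", "private", "protected", "static", "final", "abstract",
    "class", "interface", "extends", "implements", "new", "return",
    "if", "else", "for", "while", "void", "int", "boolean", "null",
    "this", "super", "true", "false",
    "String", "System", "out", "println", "Override",
}

# One pass over the source: comments, string literals, numbers, identifiers.
TOKEN = re.compile(r'''
    (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<string>"(?:\\.|[^"\\])*")
  | (?P<number>\d[\w.]*)
  | (?P<ident>[A-Za-z_][A-Za-z_0-9]*)
''', re.VERBOSE | re.DOTALL)


def obfuscate(source: str) -> str:
    """Rename identifiers consistently, blank strings, drop comments."""
    names = {}  # original identifier -> placeholder

    def repl(match):
        if match.group("comment") is not None:
            return ""                      # remove comment text
        if match.group("string") is not None:
            return '"s"'                   # neutral string literal
        if match.group("number") is not None:
            return match.group(0)          # numeric literals are untouched
        word = match.group("ident")
        if word in KEEP:
            return word
        # Same original name always maps to the same placeholder, so
        # references, inheritance, and call structure stay consistent.
        return names.setdefault(word, f"id{len(names)}")

    return TOKEN.sub(repl, source)


SAMPLE = '''
// Singleton holder
public class Logger {
    private static Logger instance;
    private Logger() {}
    public static Logger getInstance() {
        if (instance == null) { instance = new Logger(); }
        return instance;
    }
    public void log() { System.out.println("logging"); }
}
'''

if __name__ == "__main__":
    print(obfuscate(SAMPLE))
```

In the obfuscated output, the private constructor, the static self-typed field, and the lazy-initialization branch that characterize a Singleton are all still present, but no token spells out "Logger", "instance", or "getInstance". A detector that still recognizes the pattern must therefore be relying on structure instead of names.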
