GEML: A Grammar-based Evolutionary Machine Learning Approach for Design-Pattern Detection
Rafael Barbudo, Aurora Ramírez, Francisco Servant, José Raúl Romero
TL;DR
This work introduces GEML, a grammar-based evolutionary machine learning approach for automatic design pattern (DP) detection. It combines associative classification with grammar-guided genetic programming to learn readable, rule-based detectors whose syntax is defined by a context-free grammar, so a detector can be learned for each pattern without extensive parameter tuning. In parameter studies and cross-pattern experiments on DPB and P-Mart, GEML shows competitive accuracy and robustness, outperforming some ML and non-ML DP detection methods while remaining interpretable. A demonstration tool accompanies the method and illustrates practical deployment, customization for new patterns, and potential integration into development workflows. The approach offers a human-readable alternative for DP detection that can adapt to an organization's coding practices.
Abstract
Design patterns (DPs) are recognised as a good practice in software development. However, the lack of appropriate documentation often hampers traceability, and their benefits are blurred among thousands of lines of code. Automatic methods for DP detection have become relevant but are usually based on the rigid analysis of either software metrics or specific properties of the source code. We propose GEML, a novel detection approach based on evolutionary machine learning using software properties of diverse nature. First, GEML uses an evolutionary algorithm to extract the characteristics that best describe the DP, formulated as human-readable rules whose syntax conforms to a context-free grammar. Second, a rule-based classifier is built to predict whether new code contains a hidden DP implementation. GEML has been validated on five DPs taken from a public repository recurrently adopted by machine learning studies. We then extend the evaluation to 15 diverse DPs, showing the effectiveness and robustness of GEML in terms of detection capability. An initial parameter study was used to tune a parameter setup that performs well across patterns, so the approach can be applied without adjusting complex parameters to a specific pattern. Finally, a demonstration tool is also provided.
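The two steps described in the abstract can be pictured with a minimal Python sketch. It is not GEML's implementation: the grammar, the property names, the toy data, and the ranking by confidence and support are all hypothetical, and new candidates are produced by fresh random derivations from the grammar rather than by genetic crossover and mutation operators.

```python
import random

# Hypothetical toy grammar: a rule is a conjunction of conditions, and each
# condition compares one (made-up) code property with a small constant.
GRAMMAR = {
    "<rule>": [["<cond>"], ["<cond>", "AND", "<rule>"]],
    "<cond>": [["<feature>", "<op>", "<value>"]],
    "<feature>": [["num_abstract_methods"], ["num_static_fields"], ["num_subclasses"]],
    "<op>": [[">="], ["<="]],
    "<value>": [["0"], ["1"], ["2"]],
}

# Hypothetical training data: (properties of a code fragment, is it the pattern?)
DATA = [
    ({"num_abstract_methods": 0, "num_static_fields": 1, "num_subclasses": 0}, True),
    ({"num_abstract_methods": 0, "num_static_fields": 2, "num_subclasses": 0}, True),
    ({"num_abstract_methods": 2, "num_static_fields": 0, "num_subclasses": 2}, False),
    ({"num_abstract_methods": 1, "num_static_fields": 0, "num_subclasses": 1}, False),
]

def derive(symbol, rng, max_conds):
    """Expand a grammar symbol into a flat list of terminal tokens."""
    if symbol not in GRAMMAR:
        return [symbol]  # terminal token
    options = GRAMMAR[symbol]
    if symbol == "<rule>":
        # Force the non-recursive production once the condition budget is spent.
        production = options[0] if max_conds <= 1 else rng.choice(options)
        max_conds -= 1
    else:
        production = rng.choice(options)
    tokens = []
    for part in production:
        tokens.extend(derive(part, rng, max_conds))
    return tokens

def parse(tokens):
    """Turn ['f', '>=', '1', 'AND', ...] into [('f', '>=', 1), ...]."""
    conds = []
    for i in range(0, len(tokens), 4):  # every fourth token is 'AND'
        feature, op, value = tokens[i:i + 3]
        conds.append((feature, op, int(value)))
    return conds

def matches(conds, sample):
    """True if the sample satisfies every condition of the rule."""
    return all(sample[f] >= v if op == ">=" else sample[f] <= v for f, op, v in conds)

def quality(conds, data):
    """Support and confidence of the rule 'conds => pattern' on the data."""
    covered = [label for sample, label in data if matches(conds, sample)]
    support = len(covered) / len(data)
    confidence = sum(covered) / len(covered) if covered else 0.0
    return support, confidence

def evolve(data, rng, pop_size=30, generations=10):
    """Keep the best half each generation and refill with new derivations."""
    def fitness(tokens):
        support, confidence = quality(parse(tokens), data)
        return (confidence, support)
    population = [derive("<rule>", rng, 3) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: pop_size // 2]
        # Stand-in for genetic operators: offspring are fresh random rules.
        offspring = [derive("<rule>", rng, 3) for _ in survivors]
        population = survivors + offspring
    population.sort(key=fitness, reverse=True)
    return population

def classify(rules, sample):
    """Rule-based classifier: flag the sample if any kept rule fires."""
    return any(matches(parse(rule), sample) for rule in rules)

rules = evolve(DATA, random.Random(0))
best = rules[0]
# Keep only rules that never fire on a negative training example.
detector = [r for r in rules if quality(parse(r), DATA)[1] == 1.0]
print(" ".join(best), quality(parse(best), DATA))
```

Because every candidate is derived from the grammar, each learned rule is syntactically valid by construction and can be read directly as a conjunction of conditions, which is the interpretability property the abstract emphasises.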
