DPS: Design Pattern Summarisation Using Code Features

Najam Nazar, Sameer Sikka, Christoph Treude

TL;DR

This work introduces the Design Pattern Summariser (DPS), a code-feature-driven pipeline that uses JavaParser to convert Java design-pattern implementations into a JSON representation and SimpleNLG to generate pattern-aware natural-language summaries. DPS aims to capture both class-level and method-level context around GoF design patterns, enabling automated, context-rich documentation of patterns in large codebases. Through corpus construction (DPS-Corpus), feature expansion (20 features), and multi-faceted evaluation (BLEU-4, NIST, ROUGE-L, FrugalScore, and expert surveys), DPS demonstrates competitive alignment with human-written summaries and improved comprehension efficiency. The study discusses limitations, threats to validity, and avenues for future work, including integrating LLM-based approaches and supporting more programming languages, to advance automated design-pattern documentation.

Abstract

Automatic summarisation has been used efficiently in recent years to condense texts, conversations, audio, code, and various other artefacts. A range of methods, from simple template-based summaries to complex machine learning techniques -- and more recently, large language models -- have been employed to generate these summaries. Summarising software design patterns is important because it helps developers quickly understand and reuse complex design concepts, thereby improving software maintainability and development efficiency. However, the generation of summaries for software design patterns has not yet been explored. Our approach utilises code features and JavaParser to parse the code and create a JSON representation. Using an NLG library on this JSON representation, we convert it into natural language text that acts as a summary of the code, capturing the contextual information of the design pattern. Our empirical results indicate that the summaries generated by our approach capture the context in which patterns are applied in the codebase. Statistical evaluations demonstrate that our summaries closely align with human-written summaries, as evidenced by high values in the ROUGE-L, BLEU-4, NIST, and FrugalScore metrics. A follow-up survey further shows that DPS summaries were rated as capturing context better than human-generated summaries. Additionally, a time-based task activity shows that developers understand a design pattern in less time when the summaries are present than when they are not.
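To make the alignment metrics concrete, the following is a minimal sketch of how a ROUGE-L score between a generated and a human-written summary can be computed from the longest common subsequence (LCS) of their tokens. It is not the evaluation code used in the paper: the class name `RougeLSketch`, the naive whitespace tokenizer, the `beta` weighting parameter, and both example sentences are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class RougeLSketch {

    /** Lowercases and splits on whitespace; a deliberately naive tokenizer (assumption). */
    public static List<String> tokens(String text) {
        String trimmed = text.trim().toLowerCase();
        if (trimmed.isEmpty()) {
            return List.of();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    /** Length of the longest common subsequence of two token lists, via dynamic programming. */
    public static int lcsLength(List<String> a, List<String> b) {
        int[][] dp = new int[a.size() + 1][b.size() + 1];
        for (int i = 1; i <= a.size(); i++) {
            for (int j = 1; j <= b.size(); j++) {
                if (a.get(i - 1).equals(b.get(j - 1))) {
                    dp[i][j] = dp[i - 1][j - 1] + 1;
                } else {
                    dp[i][j] = Math.max(dp[i - 1][j], dp[i][j - 1]);
                }
            }
        }
        return dp[a.size()][b.size()];
    }

    /** ROUGE-L F-measure; beta = 1.0 weights precision and recall equally. */
    public static double rougeL(String candidate, String reference, double beta) {
        List<String> c = tokens(candidate);
        List<String> r = tokens(reference);
        if (c.isEmpty() || r.isEmpty()) {
            return 0.0;
        }
        int lcs = lcsLength(c, r);
        if (lcs == 0) {
            return 0.0;
        }
        double precision = (double) lcs / c.size();
        double recall = (double) lcs / r.size();
        double b2 = beta * beta;
        return (1 + b2) * precision * recall / (recall + b2 * precision);
    }

    public static void main(String[] args) {
        // Both sentences are hypothetical examples, not taken from DPS-Corpus.
        String generated = "ComputerPart is an element in the visitor pattern";
        String human = "ComputerPart acts as the element in a visitor pattern";
        System.out.printf("ROUGE-L F1 = %.3f%n", rougeL(generated, human, 1.0));
    }
}
```

Because the score depends only on the longest in-order token overlap, it rewards summaries that preserve the ordering of key terms without requiring them to be contiguous, which is one reason it is commonly paired with n-gram metrics such as BLEU-4.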

Paper Structure

This paper contains 26 sections, 4 equations, 2 figures, 8 tables.

Figures (2)

  • Figure 1: JSON representation of the ComputerPart class from the AbdurRKhalid project (alt text: JSON excerpt showing keys for classes, methods, and pattern roles used by DPS to generate summaries)
  • Figure 2: DPS: Our design pattern summarisation approach (alt text: block diagram showing parsing to AST, pattern checker producing JSON, and summary generator producing natural-language descriptions)
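The last stage named in Figure 2, turning the intermediate representation into a sentence, can be sketched as follows. This is only an illustration under stated assumptions: it uses plain string templates rather than JavaParser or SimpleNLG, and the `ClassInfo` fields, the Visitor/Element role assignment, and the `accept` method are hypothetical stand-ins for whatever the real JSON in Figure 1 contains.

```java
import java.util.List;

public class SummarySketch {

    /** Hypothetical stand-in for one class entry of the JSON representation. */
    public static final class ClassInfo {
        final String name;          // class or interface name
        final String kind;          // "class" or "interface"
        final String pattern;       // detected design pattern (assumed field)
        final String role;          // role the type plays in that pattern (assumed field)
        final List<String> methods; // declared method names

        public ClassInfo(String name, String kind, String pattern, String role, List<String> methods) {
            this.name = name;
            this.kind = kind;
            this.pattern = pattern;
            this.role = role;
            this.methods = methods;
        }
    }

    /** Joins items as "a", "a and b", or "a, b, and c". */
    public static String joinWithAnd(List<String> items) {
        if (items.isEmpty()) {
            return "";
        }
        if (items.size() == 1) {
            return items.get(0);
        }
        if (items.size() == 2) {
            return items.get(0) + " and " + items.get(1);
        }
        int last = items.size() - 1;
        return String.join(", ", items.subList(0, last)) + ", and " + items.get(last);
    }

    /** Picks "a" or "an" from the first letter of the following word. */
    static String article(String word) {
        return "aeiou".indexOf(Character.toLowerCase(word.charAt(0))) >= 0 ? "an" : "a";
    }

    /** Fills a fixed sentence template from the class-level and method-level features. */
    public static String summarise(ClassInfo c) {
        StringBuilder sb = new StringBuilder();
        sb.append(c.name).append(" is ").append(article(c.kind)).append(' ').append(c.kind)
          .append(" that plays the ").append(c.role)
          .append(" role in the ").append(c.pattern).append(" pattern.");
        if (!c.methods.isEmpty()) {
            sb.append(" It declares the ")
              .append(c.methods.size() == 1 ? "method " : "methods ")
              .append(joinWithAnd(c.methods)).append('.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical values; the real project's contents are not reproduced here.
        ClassInfo part = new ClassInfo("ComputerPart", "interface", "Visitor", "Element", List.of("accept"));
        System.out.println(summarise(part));
    }
}
```

A realisation library such as SimpleNLG would replace the hand-written article and pluralisation logic here; the sketch only shows how class-level and method-level features can feed one sentence.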