Development and Evolution of Xtext-based DSLs on GitHub: An Empirical Investigation

Weixing Zhang, Daniel Strüber, Regina Hebig

TL;DR

The paper presents a large-scale empirical study of Xtext-based DSL development and evolution on GitHub, mining 1002 repositories and analyzing 226 fully developed languages across 18 domains. It characterizes artifacts (grammars, metamodels, MWE2, and instances), development scenarios (grammar-driven, metamodel-driven, and retrofitting), and evolution patterns, revealing frequent grammar and instance updates and a predominance of perfective changes. A substantial portion of projects shows co-evolution among DSL artifacts, while many repositories lack complete instances or documentation, highlighting practical challenges in DSL maintenance. The authors provide a dataset of repository meta-information to support future research and tool development for DSL evolution, including co-evolution approaches and improved versioning and testing practices. Overall, the work illuminates how Xtext-based DSLs are built and evolve in practice, offering actionable insights for the MDE community and a resource to drive methodological advances.

Abstract

Domain-specific languages (DSLs) play a crucial role in facilitating a wide range of software development activities in the context of model-driven engineering (MDE). However, a systematic understanding of their evolution is lacking, which hinders methodology and tool development. To address this gap, we performed a comprehensive investigation into the development and evolution of textual DSLs created with Xtext, a particularly widely used language workbench in MDE. We systematically identified and analyzed 1002 GitHub repositories containing Xtext-related projects. A manual classification of the repositories identified 226 that contain a fully developed language. These were further categorized into 18 application domains, where we examined DSL artifacts and the availability of example instances. We explored DSL development practices, including development scenarios, evolution activities, and co-evolution of related artifacts. We observed that DSLs are used more, evolve faster, and are maintained longer in specific domains, such as Data Management and Databases. We identified DSL grammar definitions in 722 repositories, but only a third provided textual instances, with most utilizing over 60% of grammar rules. We found that most analyzed DSLs followed a grammar-driven approach, though some adopted a metamodel-driven approach. Additionally, we observed a trend of retrofitting existing languages in Xtext, demonstrating its flexibility beyond new DSL creation. We found that in most DSL development projects, updates to grammar definitions and example instances are very frequent, and most of the evolution activities can be classified as "perfective" changes. To support research in the model-driven engineering community, we contribute a dataset of repositories with meta-information, helping to develop improved tools for DSL evolution.

Paper Structure

This paper contains 46 sections, 8 figures, 7 tables.

Figures (8)

  • Figure 1: Overall process.
  • Figure 2: Example of Evolution History.
  • Figure 3: Count of repositories in different steps.
  • Figure 4: Classification of repositories.
  • Figure 5: Proportion of repositories in different categories.
  • ...and 3 more figures