Still Manual? Automated Linter Configuration via DSL-Based LLM Compilation of Coding Standards

Zejun Zhang, Yixin Gan, Zhenchang Xing, Tian Zhang, Yi Li, Xiwei Xu, Qinghua Lu, Liming Zhu

TL;DR

The paper addresses the labor-intensive task of configuring linters to enforce coding standards. It proposes LintCFG, a DSL-driven, LLM-based compilation pipeline that translates natural-language coding standards into tool-specific linter configurations. A generic DSL expresses rules independently of any language or linter, and a multi-stage pipeline inspired by compiler design performs NL-to-DSL parsing, configuration-name and option selection, alignment checking, and final translation to XML or JSON. Evaluated on Google's Java coding standard with Checkstyle and Google's JavaScript coding standard with ESLint, the approach achieves high DSL representation accuracy (often >80% accuracy and >90% precision/recall) and outperforms baselines by large margins on linter-configuration metrics; a user study shows substantial efficiency gains for developers. The ESLint results suggest that the approach generalizes across languages and tooling, and the paper provides benchmarks to guide future research and linter evolution. Overall, the DSL-based compilation framework reduces manual effort, improves correctness, and supports scalable, cross-language generation of linter configurations.

Abstract

Coding standards are essential for maintaining consistent and high-quality code across teams and projects. Linters help developers enforce these standards by detecting code violations. However, manual linter configuration is complex and expertise-intensive, and the diversity and evolution of programming languages, coding standards, and linters lead to repetitive and maintenance-intensive configuration work. To reduce manual effort, we propose LintCFG, a domain-specific language (DSL)-driven, LLM-based compilation approach to automate linter configuration generation for coding standards, independent of programming languages, coding standards, and linters. Inspired by compiler design, we first design a DSL to express coding rules in a tool-agnostic, structured, readable, and precise manner. Then, we build linter configurations into DSL configuration instructions. For a given natural-language coding standard, the compilation process parses it into DSL coding standards, matches them with the DSL configuration instructions to set configuration names, option names, and option values, verifies consistency between the standards and configurations, and finally generates linter-specific configurations. Experiments with Checkstyle for a Java coding standard show that our approach achieves over 90% precision and recall in DSL representation, with accuracy, precision, recall, and F1-scores close to 70% (with some exceeding 70%) in fine-grained linter configuration generation. Notably, our approach outperforms baselines by over 100% in precision. A user study further shows that our approach improves developers' efficiency in configuring linters for coding standards. Finally, we demonstrate the generality of the approach by generating ESLint configurations for JavaScript coding standards, showcasing its broad applicability across other programming languages, coding standards, and linters.
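
To make the last stage of the pipeline concrete, the following is a minimal sketch of how a selected configuration name with its option names and values might be rendered as a Checkstyle XML fragment or an ESLint JSON entry. It is not LintCFG's implementation or its DSL: the ConfigChoice class and both function names are hypothetical, introduced only for illustration. The module and rule names used in the demo (LineLength with max, max-len with code) are existing Checkstyle and ESLint settings, and the emitted XML is a bare module fragment, not a complete Checkstyle configuration file.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class ConfigChoice:
    """Hypothetical result of the matching stage: one configuration name
    plus the option names and values chosen for it."""
    name: str
    options: dict = field(default_factory=dict)


def to_checkstyle_xml(choice: ConfigChoice) -> str:
    """Render the choice as a Checkstyle <module> fragment with one
    <property> child per option."""
    module = ET.Element("module", {"name": choice.name})
    for opt_name, opt_value in choice.options.items():
        ET.SubElement(
            module, "property", {"name": opt_name, "value": str(opt_value)}
        )
    return ET.tostring(module, encoding="unicode")


def to_eslint_json(choice: ConfigChoice, severity: str = "error") -> str:
    """Render the choice as an ESLint "rules" entry. Rules without options
    are written as a bare severity string."""
    entry = [severity, choice.options] if choice.options else severity
    return json.dumps({"rules": {choice.name: entry}})


if __name__ == "__main__":
    # A line-length rule expressed once per target linter.
    print(to_checkstyle_xml(ConfigChoice("LineLength", {"max": 100})))
    print(to_eslint_json(ConfigChoice("max-len", {"code": 80})))
```

Keeping the choice as a small linter-neutral record and deferring the XML or JSON syntax to a separate function mirrors the separation the abstract describes between matching configurations and generating linter-specific output.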

Paper Structure

This paper contains 42 sections, 5 figures, and 8 tables.

Figures (5)

  • Figure 1: Approach overview of generating linter configurations for coding standards
  • Figure 2: Comparison of coding rule representations (NL, linter configuration, and DSL)
  • Figure 3: Building a linter configuration into the DSL configuration instruction from the linter document
  • Figure 4: Compilation from an NL coding standard to linter configuration
  • Figure 5: Prompt template of baselines