CIDRe: A Reference-Free Multi-Aspect Criterion for Code Comment Quality Measurement
Maria Dziuba, Valentin Malykh
TL;DR
CIDRe is a reference-free criterion for the quality of structured code comments that jointly measures four aspects: Completeness, Informativeness, Description Length, and Relevance. The method combines transformer-based informativeness weighting, multilingual embeddings, and a SIDE-inspired relevance module to produce a score in $[0,1]$ that feeds a binary good/bad classifier. On the StRuCom dataset and an independently labeled test set, CIDRe outperforms existing metrics in cross-entropy calibration. An ablation study shows that the four components work in synergy, and a side-by-side evaluation shows that filtering data with CIDRe is effective across languages and models. The work demonstrates practical gains in generation quality for Russian-language code documentation and points to future multilingual extensions and broader applicability.
Abstract
Effective generation of structured code comments requires robust quality metrics for dataset curation, yet existing approaches (SIDE, MIDQ, STASIS) analyze the relationship between code and comment only to a limited extent. We propose CIDRe, a language-agnostic, reference-free quality criterion combining four synergistic aspects: (1) relevance (code-comment semantic alignment), (2) informativeness (functional coverage), (3) completeness (presence of all structure sections), and (4) description length (detail sufficiency). We validate our criterion on a manually annotated dataset. Experiments demonstrate CIDRe's superiority over existing metrics, with improved results in cross-entropy evaluation. When CIDRe is applied to filter comments, models fine-tuned on the filtered data show statistically significant quality gains in GPT-4o-mini assessments.
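To make the four-aspect idea concrete, the following is a minimal sketch of how per-aspect scores could be folded into a single score in $[0,1]$ and thresholded into a good/bad label. It is an illustration under stated assumptions, not the authors' implementation: the tag list, the word-count target, the logistic combination, the weights, the bias, and the threshold are all hypothetical, and the informativeness and relevance scores are taken as precomputed inputs because the abstract does not specify how they are calculated.

```python
import math
import re
from dataclasses import dataclass

# Hypothetical section tags for a Javadoc/JSDoc-style structured comment.
EXPECTED_TAGS = ("@param", "@return")


@dataclass
class AspectScores:
    """Per-aspect scores, each assumed to lie in [0, 1]."""
    completeness: float
    informativeness: float      # assumed to come from an external model
    description_length: float
    relevance: float            # assumed to come from an external model


def completeness(comment: str, expected_tags=EXPECTED_TAGS) -> float:
    """Fraction of the expected section tags that appear in the comment."""
    if not expected_tags:
        return 1.0
    present = sum(1 for tag in expected_tags if tag in comment)
    return present / len(expected_tags)


def description_length(comment: str, target_words: int = 20) -> float:
    """Word count of the text before the first tag, capped at target_words."""
    description = re.split(r"@\w+", comment, maxsplit=1)[0]
    words = re.findall(r"\w+", description)
    return min(len(words) / target_words, 1.0)


def combine(scores: AspectScores,
            weights=(1.0, 1.0, 1.0, 1.0),
            bias: float = -2.0) -> float:
    """Logistic combination of the four aspects (an assumed functional form)."""
    values = (scores.completeness, scores.informativeness,
              scores.description_length, scores.relevance)
    z = bias + sum(w * v for w, v in zip(weights, values))
    return 1.0 / (1.0 + math.exp(-z))


def is_good(score: float, threshold: float = 0.5) -> bool:
    """Binary good/bad decision on the combined score."""
    return score >= threshold


if __name__ == "__main__":
    comment = "Adds two numbers. @param a first addend @param b second addend @return the sum"
    scores = AspectScores(
        completeness=completeness(comment),
        informativeness=0.8,   # placeholder value
        description_length=description_length(comment),
        relevance=0.9,         # placeholder value
    )
    score = combine(scores)
    print(f"score={score:.3f}, good={is_good(score)}")
```

In a data-filtering setting, a function like `is_good` would be applied to each code-comment pair and only the pairs labeled good would be kept for fine-tuning; in practice the weights and threshold would have to be fitted on annotated data rather than fixed by hand as they are here.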
