How good are my search strings? Reflections on using an existing review as a quasi-gold standard
Huynh Khanh Vi Tran, Jürgen Börstler, Nauman Bin Ali, Michael Unterkalmsteiner
TL;DR
The paper investigates how to assess the quality of search strings in systematic literature studies (SLS) by using a quasi-gold standard (QGS) derived from an existing SLS. Through a comparative analysis of two tertiary studies (TAQ and ST), it reveals gaps and biases in QGS construction and the limitations of relying solely on recall and precision for validation. It then proposes extended guidelines that add a step for analyzing the automated search results, and it emphasizes three desirable QGS characteristics (relevance, size, diversity) to improve search completeness. The work offers practical guidance for constructing more reliable search strategies in evidence-based software engineering and calls for broader validation across topics.
Abstract
Background: Systematic literature studies (SLS) have become a core research methodology in Evidence-based Software Engineering (EBSE). Search completeness, i.e., finding all relevant papers on the topic of interest, has been recognized as one of the most commonly discussed validity issues of SLSs. Aim: This study aims to raise awareness of issues related to search string construction and to search validation using a quasi-gold standard (QGS). Furthermore, we aim to provide guidelines for search string validation. Method: We use a recently completed tertiary study as a case and complement our findings with observations from other researchers studying and advancing EBSE. Results: We found that assessing QGS quality has received little attention in the literature, and that the validation of automated searches in SLSs could be improved. Hence, we propose extending the current search validation approach with an additional step that analyzes the automated search validation results, and we provide recommendations for QGS construction. Conclusion: In this paper, we report on new issues that could affect search completeness in SLSs. Furthermore, the proposed guideline and recommendations could help researchers implement a more reliable search strategy in their SLSs.
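To make the validation step concrete, the following is a minimal sketch of how recall and precision of an automated search are commonly computed against a QGS. It is an illustration under stated assumptions, not the authors' tooling: the function name `evaluate_search` and the paper identifiers are hypothetical, and papers are assumed to be matched by a unique identifier such as a DOI.

```python
def evaluate_search(retrieved_ids, qgs_ids):
    """Compare an automated search result against a quasi-gold standard (QGS).

    Both arguments are iterables of paper identifiers (assumed unique, e.g. DOIs).
    Returns (recall, precision, missed), where `missed` lists the QGS papers
    that the search string failed to retrieve.
    """
    retrieved = set(retrieved_ids)
    qgs = set(qgs_ids)
    hits = retrieved & qgs  # QGS papers found by the search

    # Recall: share of the QGS that the search retrieved.
    recall = len(hits) / len(qgs) if qgs else 0.0
    # Precision: share of retrieved papers that belong to the QGS.
    precision = len(hits) / len(retrieved) if retrieved else 0.0

    # Missed QGS papers are the starting point for analyzing the search string.
    missed = sorted(qgs - retrieved)
    return recall, precision, missed


# Hypothetical identifiers, for illustration only.
qgs = {"P1", "P2", "P3", "P4"}
retrieved = {"P1", "P2", "P4", "X1", "X2", "X3", "X4", "X5"}

recall, precision, missed = evaluate_search(retrieved, qgs)
print(f"recall={recall:.3f} precision={precision:.3f} missed={missed}")
```

Returning the missed papers alongside the two ratios reflects the point made above: the numbers alone say little about why a search string falls short, whereas the list of missed QGS papers can be inspected for patterns. Both ratios also depend entirely on the QGS, so a small or biased QGS can make a weak search string look good.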
