Free Open Source Communities Sustainability: Does It Make a Difference in Software Quality?
Adam Alami, Raúl Pardo, Johan Linåker
TL;DR
This study investigates whether the sustainability of Free and Open Source Software (FOSS) communities affects software quality. Using the Linåker framework, the authors quantify 16 sustainability indicators across four themes (communication, popularity, stability, technical activity) for 217 Apache Software Foundation Incubator (ASFI) projects and relate them to eight software quality metrics via Bayesian Gaussian and Poisson regressions. For most indicators, they find no consistent positive or negative impact on defect density or code coverage. Community age, however, positively influences several code-quality submetrics such as risk complexity and file/function size, while some sustainability factors negatively affect other quality aspects. The results suggest that code quality practices are not uniformly tied to sustainability, and they highlight the importance of emphasizing testing/QA and mentoring within sustainable projects. The work provides practical guidance for ASFI communities and cautions against using sustainability alone as a proxy for software quality, calling for more nuanced, context-aware approaches and further cross-community validation.
Abstract
Context: The ability of Free and Open Source Software (FOSS) communities to stay viable and productive over time is pivotal for society, as they maintain the building blocks that digital infrastructure, products, and services depend on. Sustainability may, however, be characterized along multiple aspects, and less is known about how these aspects interplay and impact community outputs, and software quality specifically. Objective: This study, therefore, aims to empirically explore how the different aspects of FOSS sustainability impact software quality. Method: 16 sustainability metrics across four categories were sampled and applied to a set of 217 OSS projects sourced from the Apache Software Foundation Incubator program. The impact of a decline in the sustainability metrics was analyzed against eight software quality metrics using Bayesian data analysis, which incorporates probability distributions to represent the regression coefficients and intercepts. Results: Findings suggest that the selected sustainability metrics do not significantly affect defect density or code coverage. However, a positive impact of community age was observed on specific code quality metrics, such as risk complexity, number of very large files, and code duplication percentage. Interestingly, findings show that even in communities that are otherwise sustainable, certain code quality metrics are negatively impacted. Conclusion: Findings imply that code quality practices are not consistently linked to sustainability, and that defect management and prevention may be prioritized over code quality practices. Results suggest that growth, resulting in a larger and more complex codebase, combined with a probable lack of understanding of code quality standards, may explain the degradation in certain aspects of code quality.
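To make the Method concrete, the following is a minimal sketch of a Bayesian Poisson regression in which the intercept and the coefficient are represented by probability distributions rather than point estimates. It is not the authors' implementation: the toy data, the prior choices (Normal(0, 2) on the intercept, Normal(0, 1) on the slope), the grid-approximation technique, and all function names are assumptions chosen for illustration only.

```python
import math

# Hypothetical toy data: a standardized sustainability indicator (x)
# and a count-valued quality metric (y), e.g. a number of very large files.
xs = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
ys = [1, 1, 2, 3, 4, 7, 11]

def log_normal_pdf(value, mu, sigma):
    """Log density of a Normal(mu, sigma) distribution."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (value - mu) ** 2 / (2 * sigma ** 2)

def log_poisson_pmf(count, rate):
    """Log probability of observing `count` under a Poisson(rate) distribution."""
    return count * math.log(rate) - rate - math.lgamma(count + 1)

def grid_posterior(xs, ys, grid):
    """Grid-approximate the joint posterior of (intercept, slope).

    Model (assumed for illustration):
        y_i ~ Poisson(exp(intercept + slope * x_i))
        intercept ~ Normal(0, 2), slope ~ Normal(0, 1)
    Returns a list of (intercept, slope, probability) tuples.
    """
    points = []
    for intercept in grid:
        for slope in grid:
            log_post = log_normal_pdf(intercept, 0.0, 2.0) + log_normal_pdf(slope, 0.0, 1.0)
            for x, y in zip(xs, ys):
                log_post += log_poisson_pmf(y, math.exp(intercept + slope * x))
            points.append((intercept, slope, log_post))
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(lp for _, _, lp in points)
    weights = [math.exp(lp - peak) for _, _, lp in points]
    total = sum(weights)
    return [(a, b, w / total) for (a, b, _), w in zip(points, weights)]

def summarize_slope(posterior, mass=0.95):
    """Posterior mean and equal-tailed credible interval of the slope."""
    marginal = {}
    for _, slope, prob in posterior:
        marginal[slope] = marginal.get(slope, 0.0) + prob
    slopes = sorted(marginal)
    mean = sum(s * marginal[s] for s in slopes)
    tail = (1.0 - mass) / 2.0
    lower, upper = slopes[0], slopes[-1]
    cumulative, lower_found = 0.0, False
    for s in slopes:
        cumulative += marginal[s]
        if not lower_found and cumulative >= tail:
            lower, lower_found = s, True
        if cumulative >= 1.0 - tail:
            upper = s
            break
    return mean, lower, upper

if __name__ == "__main__":
    grid = [i * 0.05 - 3.0 for i in range(121)]  # candidate values from -3 to 3
    posterior = grid_posterior(xs, ys, grid)
    mean, lower, upper = summarize_slope(posterior)
    print(f"slope: posterior mean {mean:.2f}, 95% interval [{lower:.2f}, {upper:.2f}]")
```

In a sketch like this, an effect would typically be read as credibly positive or negative when the interval for the slope excludes zero. A Gaussian regression for continuous metrics would follow the same pattern with a Normal likelihood in place of the Poisson one.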
