Perspective of Software Engineering Researchers on Machine Learning Practices Regarding Research, Review, and Education
Anamaria Mojica-Hanke, David Nader Palacio, Denys Poshyvanyk, Mario Linares-Vásquez, Steffen Herbold
TL;DR
The paper probes how SE researchers view and apply ML in SE, addressing gaps left by practitioner-focused work. Through grounded-theory analysis of SE articles, author surveys, and expert interviews, it uncovers prevalent data handling, model training, and evaluation practices, while revealing underutilized areas such as hyperparameter tuning and human-in-the-loop evaluation. It also identifies gaps in guidelines for reviewing ML-in-SE work and for teaching ML to SE audiences, emphasizing non-functional properties and qualitative assessments. The findings suggest a need for SE-specific guidelines, broader teaching approaches, and greater emphasis on underrepresented practices to bridge the gap between declared best practices and actual usage. Overall, the study offers a multi-perspective view that can inform guidelines, pedagogy, and future research in ML4SE and SE4ML.
Abstract
Context: Machine Learning (ML) significantly impacts Software Engineering (SE), but existing studies focus mainly on practitioners and neglect researchers. As a result, the practices and challenges involved in teaching, researching, or reviewing ML applications in SE are overlooked. Objective: This study aims to contribute to the knowledge about the synergy between ML and SE from the perspective of SE researchers by providing insights into the practices followed when researching, teaching, and reviewing SE studies that apply ML. Method: We studied SE researchers who are familiar with ML or who authored SE articles using ML, along with the articles themselves. Using grounded theory coding and qualitative analysis, we examined the practices followed, the SE tasks addressed with ML, the challenges faced, and the perspectives of reviewers and educators. Results: We found diverse practices focusing on data collection, model training, and evaluation. Some recommended practices (e.g., hyperparameter tuning) appeared in less than 20% of the literature. Common challenges involve data handling, model evaluation (including non-functional properties), and involving human expertise in evaluation. Hands-on activities are common in education, though traditional methods persist. Conclusion: Despite accepted practices in applying ML to SE, significant gaps remain. By enhancing guidelines, adopting diverse teaching methods, and emphasizing underrepresented practices, the SE community can bridge these gaps and advance the field.
