On the Automated Processing of User Feedback
Walid Maalej, Volodymyr Biryuk, Jialiang Wei, Fabian Panse
TL;DR
This chapter tackles the challenge of leveraging the growing volume and uneven quality of user feedback for software requirements engineering. It presents a pipeline-oriented approach that combines preprocessing, classification, summarisation, and artefact matching, guided by NLP/ML techniques and, increasingly, Large Language Models. Key contributions include a structured discussion of data quality management, demographic annotation, and feedback augmentation with implicit data to improve analysis fidelity. The chapter also contrasts vertical and horizontal classification and demonstrates practical, model-based strategies for real-world applicability. It concludes with guidance for practitioners and researchers, supported by computational notebooks to reproduce and adapt the methods, and emphasises the need to keep an analyst in the loop during clustering and summarisation to achieve reliable, actionable insights.
Abstract
User feedback is becoming an increasingly important source of information for requirements engineering, user interface design, and software engineering in general. Nowadays, user feedback is widely available and easily accessible on social media, in product forums, and in app stores. Over the last decade, research has shown that user feedback can help software teams: a) better understand how users are actually using specific product features and components, b) identify, reproduce, and fix defects faster, and c) get inspiration for improvements or new features. However, to tap the full potential of feedback, two main challenges need to be solved. First, software vendors must cope with a large quantity of feedback data, which is hard to manage manually. Second, vendors must also cope with the varying quality of feedback, as some items might be uninformative, repetitive, or simply wrong. This chapter summarises various data mining, machine learning, and natural language processing techniques, including recent Large Language Models, and arranges them into a pipeline to cope with the quantity and quality challenges. We guide researchers and practitioners through implementing effective, actionable analysis of user feedback for software and requirements engineering.
