Classification of User Reports for Detection of Faulty Computer Components using NLP Models: A Case Study
Maria de Lourdes M. Silva, André L. C. Mendonça, Eduardo R. D. Neto, Iago C. Chaves, Felipe T. Brito, Victor A. E. Farias, Javam C. Machado
TL;DR
The paper uses NLP to detect faulty computer components from user-written fault reports. It constructs a labeled dataset of 341 reports across eight component classes and evaluates transformer-based models (including BART, DeBERTa variants, MiniLM, and MPNet) in zero-shot, one-shot, and few-shot settings. Few-shot 6MLM performs best, with an accuracy of $0.79$ and an F1 score of $0.79$, suggesting that a small amount of labeled training data can generalize well on this task. The work supports the viability of integrating NLP-based fault classification into diagnostic workflows and proposes future directions in online assistance, an extension to audio input, and privacy-aware deployment.
Abstract
Computer manufacturers typically offer platforms for users to report faults. However, these platforms make little effective use of textual reports, which prevents users from describing their issues in their own words. In this context, Natural Language Processing (NLP) offers a promising solution by enabling the analysis of user-generated text. This paper presents an innovative approach that employs NLP models to classify user reports in order to detect faulty computer components, such as the CPU, memory, motherboard, and video card. In this work, we build a dataset of 341 user reports obtained from multiple sources. In an extensive experimental evaluation, our approach achieves an accuracy of 79% on our dataset.
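To make the reported metrics concrete, the following minimal sketch shows how accuracy and an F1 score can be computed for a multi-class component classifier. It is an illustration only: the labels and predictions are made-up toy values, not data from the paper, and macro-averaging of F1 is an assumption, since the text above does not state which averaging the authors used.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of reports whose predicted component matches the label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro-averaging is assumed)."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels; class names are taken from the abstract.
y_true = ["cpu", "memory", "memory", "motherboard", "video card", "cpu"]
y_pred = ["cpu", "memory", "cpu", "motherboard", "video card", "cpu"]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy={acc:.2f} macro_f1={f1:.2f}")
```

With imbalanced classes, accuracy and macro-averaged F1 can diverge noticeably, so reporting both gives a fuller picture of per-component performance.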
