Learning to Prioritize IT Tickets: A Comparative Evaluation of Embedding-based Approaches and Fine-Tuned Transformer Models
Minh Tri LÊ, Ali Ait-Bachir
TL;DR
Problem: Prioritize IT service tickets under noisy, multilingual text and severe class imbalance. Approach: compare embedding-based pipelines with a domain-adapted, fine-tuned transformer that fuses textual content and numerical features. Findings: embedding-based methods generalize poorly and underperform, while the transformer achieves a strong average F1-score (~78.5%) and weighted Cohen's kappa (~0.80), supporting production-ready prioritization. Significance: demonstrates the limitations of generic embeddings for ITSM and provides a scalable, real-time transformer-based solution for ticket prioritization.
Abstract
Prioritizing service tickets in IT Service Management (ITSM) is critical for operational efficiency but remains challenging due to noisy textual inputs, subjective writing styles, and pronounced class imbalance. We evaluate two families of approaches for ticket prioritization: embedding-based pipelines that combine dimensionality reduction, clustering, and classical classifiers, and a fine-tuned multilingual transformer that processes both textual and numerical features. Across thirty configurations, the embedding-based methods generalize poorly: clustering fails to uncover meaningful structure, and the supervised models are highly sensitive to embedding quality. In contrast, the proposed transformer model achieves substantially higher performance, with an average F1-score of 78.5% and weighted Cohen's kappa values of nearly 0.80, indicating strong agreement with the true labels. These results highlight the limitations of generic embeddings for ITSM data and demonstrate the effectiveness of domain-adapted transformer architectures for operational ticket prioritization.
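
For readers less familiar with the two reported metrics, the following is a minimal sketch of how an average F1-score and a weighted Cohen's kappa can be computed for ordered priority classes. It is illustrative only and makes several assumptions not stated in the abstract: that "average F1" means the macro average, that the kappa uses quadratic weights (linear weights are also supported), and that the priority labels `low`, `medium`, `high` and the toy predictions are hypothetical placeholders rather than the paper's data.

```python
def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # A class that never appears in y_true or y_pred contributes 0.0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def weighted_kappa(y_true, y_pred, labels, weights="quadratic"):
    """Cohen's kappa with linear or quadratic disagreement weights.

    `labels` must be given in priority order, because the penalty for a
    mistake grows with the distance between the true and predicted class.
    """
    index = {c: i for i, c in enumerate(labels)}
    k, n = len(labels), len(y_true)

    # Observed joint distribution of (true class, predicted class).
    observed = [[0.0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[index[t]][index[p]] += 1.0 / n

    # Marginals, used to build the distribution expected by chance.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    power = 2 if weights == "quadratic" else 1
    num = den = 0.0
    for i in range(k):
        for j in range(k):
            w = abs(i - j) ** power
            num += w * observed[i][j]
            den += w * row[i] * col[j]

    # den is zero only when both raters use a single, identical class.
    return 1.0 - num / den if den else 1.0


# Hypothetical toy data: four tickets, one medium ticket predicted from a low one.
labels = ["low", "medium", "high"]
y_true = ["low", "low", "medium", "high"]
y_pred = ["low", "medium", "medium", "high"]

f1 = macro_f1(y_true, y_pred, labels)
kappa = weighted_kappa(y_true, y_pred, labels, weights="quadratic")
print(f"macro F1 = {f1:.3f}, weighted kappa = {kappa:.3f}")
```

Unlike accuracy, the weighted kappa penalizes a `low` ticket predicted as `high` more heavily than one predicted as `medium`, which is why it suits ordered priority levels under class imbalance.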
