Detecting Interpretable Subgroup Drifts
Flavio Giobergia, Eliana Pastor, Luca de Alfaro, Elena Baralis
TL;DR
DriftInspector addresses a limitation of global drift monitoring: performance drifts confined to a data subgroup can go unnoticed. It identifies frequent, interpretable subgroups during training, constructs sparse representations that make it efficient to monitor subgroup performance over time, and detects drift with a per-subgroup Welch's t-test on a Bayesianized performance metric $h(X)$. The approach is sensitive to drifts in small subgroups that do not show at the global level, provides interpretable drift summaries, and maintains strong performance on global drifts, all with favorable computational efficiency. This enables targeted model maintenance and fairness-aware interventions in dynamic environments where subgroup behavior evolves independently of the overall population.
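The pipeline summarized above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: subgroups are conjunctions of attribute=value items mined by brute-force enumeration, the performance metric is accuracy, "Bayesianized" is read as a Beta posterior under a uniform prior, and the posterior variances take the place of the squared standard errors in Welch's statistic. All function names (`frequent_subgroups`, `counts`, `beta_posterior`, `drift_t`, `make_window`) and the toy data are hypothetical.

```python
from itertools import combinations
from math import sqrt

def frequent_subgroups(rows, min_support, max_len=2):
    """Enumerate attribute=value itemsets whose support is at least min_support."""
    n = len(rows)
    # Sparse membership: each item maps only to the indices of rows containing it.
    item_rows = {}
    for i, row in enumerate(rows):
        for item in row.items():
            item_rows.setdefault(item, set()).add(i)
    found = [()]  # the empty itemset stands for the whole dataset (global view)
    for k in range(1, max_len + 1):
        for combo in combinations(sorted(item_rows), k):
            if len({attr for attr, _ in combo}) < k:
                continue  # an attribute cannot take two values at once
            members = set.intersection(*(item_rows[item] for item in combo))
            if len(members) / n >= min_support:
                found.append(combo)
    return found

def counts(subgroup, rows, correct):
    """Return (hits, size) over the rows that match every item of the subgroup."""
    idx = [i for i, row in enumerate(rows)
           if all(row.get(a) == v for a, v in subgroup)]
    return sum(correct[i] for i in idx), len(idx)

def beta_posterior(hits, size):
    """Mean and variance of Beta(hits + 1, misses + 1) (assumed uniform prior)."""
    a, b = hits + 1, size - hits + 1
    return a / (a + b), a * b / ((a + b) ** 2 * (a + b + 1))

def drift_t(subgroup, ref, cur):
    """Welch-style t statistic between two windows; negative means a drop."""
    m0, v0 = beta_posterior(*counts(subgroup, *ref))
    m1, v1 = beta_posterior(*counts(subgroup, *cur))
    return (m1 - m0) / sqrt(v0 + v1)

def make_window(spec):
    """Build toy (rows, correct) from {(g, r): (size, number_correct)}."""
    rows, correct = [], []
    for (g, r), (n, k) in spec.items():
        rows += [{"g": g, "r": r}] * n
        correct += [1] * k + [0] * (n - k)
    return rows, correct

# Reference window: 90% accuracy everywhere. Current window: only the small
# subgroup (g=B, r=y) degrades, to 50% accuracy.
ref = make_window({("A", "x"): (450, 405), ("A", "y"): (450, 405),
                   ("B", "x"): (50, 45), ("B", "y"): (50, 45)})
cur = make_window({("A", "x"): (450, 405), ("A", "y"): (450, 405),
                   ("B", "x"): (50, 45), ("B", "y"): (50, 25)})

subgroups = frequent_subgroups(ref[0], min_support=0.04)
scores = {sg: drift_t(sg, ref, cur) for sg in subgroups}
worst = min(scores, key=scores.get)  # subgroup with the largest performance drop
print("global t:", round(scores[()], 2))
print("worst subgroup:", worst, "t:", round(scores[worst], 2))
```

In this toy setting the global statistic stays below 2 in absolute value, whereas the statistic for the degraded subgroup is well beyond it, which mirrors the claim that subgroup-level monitoring can surface drifts that are diluted at the global level. A real deployment would replace the brute-force enumeration with a frequent-pattern mining algorithm and would correct for testing many subgroups at once.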
Abstract
The ability to detect and adapt to changes in data distributions is crucial to maintain the accuracy and reliability of machine learning models. Detection is generally approached by observing the drift of model performance from a global point of view. However, drifts occurring in (fine-grained) data subgroups may go unnoticed when monitoring global drift. We take a different perspective, and introduce methods for observing drift at the finer granularity of subgroups. Relevant data subgroups are identified during training and monitored efficiently throughout the model's life. Performance drifts in any subgroup are detected, quantified and characterized so as to provide an interpretable summary of the model behavior over time. Experimental results confirm that our subgroup-level drift analysis identifies drifts that do not show at the (coarser) global dataset level. The proposed approach provides a valuable tool for monitoring model performance in dynamic real-world applications, offering insights into the evolving nature of data and ultimately contributing to more robust and adaptive models.
