The State of the Art in Enhancing Trust in Machine Learning Models with the Use of Visualizations
A. Chatzimparmpas, R. Martins, I. Jusufi, K. Kucher, F. Rossi, A. Kerren
TL;DR
This work addresses the challenge of establishing trust in ML systems by surveying visualization techniques that enhance understanding of, and trust in, data, algorithms, and outcomes. It introduces a fine-grained, multi-level taxonomy (TL1–TL5) linking data, processing, learning methods, concrete models, and evaluation to visualization techniques, complemented by empirical analyses and a public TrustMLVis browser. Through topic modeling, correlation analyses, and data-set investigations across 200 papers, this State-of-the-Art Report (STAR) reveals trends, gaps, and opportunities for visualization to improve trust, including uncertainty awareness, fairness, and in-situ model comparisons. The work offers a practical roadmap for researchers and practitioners to design trust-enhancing visualizations and to prioritize underexplored areas, with the TrustMLVis browser enabling ongoing, community-driven exploration and extension.
Abstract
Machine learning (ML) models are now used in complex applications in various domains, such as medicine, bioinformatics, and other sciences. Due to their black-box nature, however, it can be hard to understand and trust the results they provide. This has increased the demand for reliable visualization tools that enhance trust in ML models, which has become a prominent topic of research in the visualization community over the past decades. To provide an overview and present the frontiers of current research on the topic, we present a State-of-the-Art Report (STAR) on enhancing trust in ML models with the use of interactive visualization. We define and describe the background of the topic, introduce a categorization of visualization techniques that aim to accomplish this goal, and discuss insights and opportunities for future research directions. Among our contributions is a categorization of trust against different facets of interactive ML, expanded and improved from previous research. We investigate our results from different analytical perspectives: (a) providing a statistical overview, (b) summarizing key findings, (c) performing topic analyses, and (d) exploring the data sets used in the individual papers, all with the support of an interactive web-based survey browser. We intend this survey to benefit visualization researchers whose interests involve making ML models more trustworthy, as well as researchers and practitioners from other disciplines who are searching for effective visualization techniques that let them solve their tasks with confidence and convey meaning in their data.
