Exploratory Visual Analysis for Increasing Data Readiness in Artificial Intelligence Projects
Mattias Tiger, Daniel Jakobsson, Anders Ynnerman, Fredrik Heintz, Daniel Jönsson
TL;DR
This work addresses raising data readiness for AI through integrated visualization, extending the data readiness concept to time-varying and text data and formalizing a mapping from data-readiness questions to simple visual analyses. It introduces an extended A-B-C band framework with seven $A$-aspects to connect data, task, and solution considerations, and presents minimalist visualization guidelines that support data profiling and stakeholder communication. The approach is demonstrated via multi-year case studies, showing how visual analysis uncovers data issues, aids decisions on data collection, and informs model adaptation and deployment readiness. The results highlight practical benefits for data-centric AI workflows and point to future work on extending the guidelines to classification tasks and more integrated visualization environments.
Abstract
We present experiences and lessons learned from increasing the data readiness of heterogeneous data for artificial intelligence projects using visual analysis methods. Increasing the data readiness level involves understanding both the data and the context in which it is used, challenges that are well suited to visual analysis. For this purpose, we contribute a mapping between data readiness aspects and visual analysis techniques suitable for different data types. We use this mapping to increase data readiness levels in use cases involving time-varying data, including numerical, categorical, and text data. In addition to the mapping, we extend the data readiness concept to better account for aspects of the task and solution and to explicitly address distribution shifts that occur during data collection. We report on our experiences with the presented visual analysis techniques to aid future artificial intelligence projects in raising the data readiness level.
