DaedalusData: Exploration, Knowledge Externalization and Labeling of Particles in Medical Manufacturing -- A Design Study
Alexander Wyss, Gabriela Morgenshtern, Amanda Hirsch-Hüsler, Jürgen Bernard
TL;DR
DaedalusData tackles the challenge of scalable exploration and labeling of particle contaminants in in-vitro diagnostics (IVD) manufacturing, and of externalizing expert knowledge about them. The authors present a visual analytics system, developed through a design study, that supports attribute-based and label-informed projections, enables labeling with multiple label alphabets, and persists the externalized knowledge to augment the dataset. Through two Roche case studies and a user study, DaedalusData demonstrates high usability, efficient labeling, and meaningful knowledge externalization that can inform quality control decisions. The work contributes a generalizable framework for knowledge externalization and discusses the scalability and trade-offs of human-in-the-loop data augmentation in industrial diagnostics.
Abstract
In medical diagnostics, for both early disease detection and routine patient care, particle-based contamination of in-vitro diagnostics consumables poses a significant threat to patients. Objective, data-driven decision-making on the severity of contamination is key to reducing patient risk while saving time and cost in quality assessment. Our collaborators introduced us to their quality control process, which includes particle data acquisition through image recognition, feature extraction, and attributes reflecting the production context of particles. The shortcomings of the current process are limited support for exploring thousands of images, limited support for data-driven decision-making, and ineffective knowledge externalization. Following the design study methodology, our contributions are a characterization of the problem space and requirements, the development and validation of DaedalusData, a comprehensive discussion of our study's learnings, and a generalizable framework for knowledge externalization. DaedalusData is a visual analytics system that enables domain experts to explore particle contamination patterns, label particles using label alphabets, and externalize knowledge through semi-supervised, label-informed data projections. The results of our case study and user study show that DaedalusData is highly usable and efficiently supports experts in generating comprehensive overviews of thousands of particles, labeling large quantities of particles, and externalizing knowledge to further augment the dataset. Reflecting on our approach, we discuss insights on dataset augmentation via human knowledge externalization, and on the scalability and trade-offs that come with adopting this approach in practice.
