DaedalusData: Exploration, Knowledge Externalization and Labeling of Particles in Medical Manufacturing -- A Design Study

Alexander Wyss, Gabriela Morgenshtern, Amanda Hirsch-Hüsler, Jürgen Bernard

TL;DR

DaedalusData tackles the challenge of scalable exploration, labeling, and knowledge externalization of particle contaminants in in-vitro diagnostics (IVD) manufacturing. The authors present a visual analytics system, developed through a design study, that supports attribute-based and label-informed projections, enables labeling with multiple label alphabets, and persists the externalized knowledge to augment the dataset. Through two Roche case studies and a user study, DaedalusData demonstrates high usability, efficient labeling, and meaningful knowledge externalization that can inform quality control decisions. The work contributes a generalizable framework for knowledge externalization and discusses the scalability and trade-offs of human-in-the-loop data augmentation in industrial diagnostics.

Abstract

In medical diagnostics of both early disease detection and routine patient care, particle-based contamination of in-vitro diagnostics consumables poses a significant threat to patients. Objective data-driven decision-making on the severity of contamination is key for reducing patient risk, while saving time and cost in quality assessment. Our collaborators introduced us to their quality control process, including particle data acquisition through image recognition, feature extraction, and attributes reflecting the production context of particles. Shortcomings in the current process are limitations in exploring thousands of images, data-driven decision making, and ineffective knowledge externalization. Following the design study methodology, our contributions are a characterization of the problem space and requirements, the development and validation of DaedalusData, a comprehensive discussion of our study's learnings, and a generalizable framework for knowledge externalization. DaedalusData is a visual analytics system that enables domain experts to explore particle contamination patterns, label particles in label alphabets, and externalize knowledge through semi-supervised label-informed data projections. The results of our case study and user study show high usability of DaedalusData and its efficient support of experts in generating comprehensive overviews of thousands of particles, labeling of large quantities of particles, and externalizing knowledge to augment the dataset further. Reflecting on our approach, we discuss insights on dataset augmentation via human knowledge externalization, and on the scalability and trade-offs that come with the adoption of this approach in practice.

Paper Structure (48 sections, 10 figures, 2 tables)

Figures (10)

  • Figure 1: The overview of data abstractions reveals the added value of DaedalusData: while we carefully characterized 12 attributes during design (top), experts can further augment particle data through online interactive labeling (bottom). Multiple label alphabets can be re-played as augmented attributes, informing the positioning of particles for their semantic exploration. Top (design phase): The data abstraction included Image Context, with nine numerical attributes on the size and shape of particles, computationally derived from images; in addition, the Production Context offers three categorical/ordinal attributes contextualizing the particle origin. Bottom (application phase): the Label Context is created by experts using DaedalusData, augmenting the dataset by labeling new attributes with rich domain semantics, expressed through multiple label alphabets.
  • Figure 2: The Attribute View provides an overview of particles (R1), structured by a user-selectable attribute, easing the comparison of particles between attribute levels (categories, bins for numerical attributes). Here, the expert chose the Lot Number and zoomed toward two lots, for a detailed inspection. The expert makes an interesting observation: they identified that many of the 669 particles of Lot 027 (right) appear to be blue, compared to the orange tone of Lot 005 (left).
  • Figure 3: The Projection View provides a comprehensive similarity-preserving particle overview (R2), leveraging dimensionality reduction. Experts have full flexibility in the selection of attributes, triggering re-computation. Adding an augmented expert-labeled attribute creates a Label-Informed Projection, representing added domain semantics (R7). In this example, the expert combined image attributes with the Lot Number. The projection shows an almost symmetric structure (left vs. right), representing two lots of high similarity opposite each other.
  • Figure 4: Visual customization in practice (R4). These three examples show the Canvas customization options available in DaedalusData. Customization helps experts mitigate overplotting: experts can declutter their Canvas by toggling between relative and absolute particle sizes, and by calibrating how much the image background is obscured.
  • Figure 5: Projection View of thousands of images, with the Filter View (R3) on the right. Expert-selected filters for Lot Number (top) and Supplier (bottom) narrow down the search space. The Canvas encodes which particles have been filtered out in the projection by coloring their images light gray. In dark gray, the bar chart encodes the exclusion of suppliers "D", "E", "F", "G". In red, the bars encode the number of particles filtered by the other attribute, alerting the expert to possible relationships between selected filters.
  • ...and 5 more figures
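
The Label-Informed Projection described for Figure 3 combines computed attributes with expert-assigned labels before dimensionality reduction. The following is a minimal sketch of that idea, not the authors' implementation: it standardizes numerical attributes, one-hot encodes one label alphabet, and concatenates both into a feature vector that a projection method of choice could consume. The function names, the example attributes (area, elongation), the label symbols, and the `weight` parameter are all hypothetical assumptions introduced for illustration.

```python
import math

def zscore_columns(rows):
    """Standardize each numeric column to zero mean and unit variance."""
    scaled_cols = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        var = sum((v - mean) ** 2 for v in col) / len(col)
        std = math.sqrt(var) or 1.0  # constant column: avoid division by zero
        scaled_cols.append([(v - mean) / std for v in col])
    return [list(r) for r in zip(*scaled_cols)]

def one_hot(labels, weight=1.0):
    """Encode one label alphabet; unlabeled particles (None) get all zeros."""
    alphabet = sorted({l for l in labels if l is not None})
    return [[weight if l == a else 0.0 for a in alphabet] for l in labels]

def label_informed_features(numeric_rows, labels, weight=1.0):
    """Concatenate scaled numeric attributes with weighted label encodings.

    `weight` (an assumption of this sketch) controls how strongly the
    expert labels influence particle similarity.
    """
    scaled = zscore_columns(numeric_rows)
    encoded = one_hot(labels, weight)
    return [s + e for s, e in zip(scaled, encoded)]

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.dist(a, b)

# Hypothetical particles: [area, elongation]; labels from one alphabet.
numeric = [[10.0, 1.0], [10.0, 1.0], [50.0, 3.0], [12.0, 1.1]]
labels = ["fiber", "metal", "metal", None]

features = label_informed_features(numeric, labels, weight=2.0)
```

In this sketch, particles 0 and 1 have identical numerical attributes, so they would coincide in a projection based on those attributes alone. Once the differing labels are added, their feature vectors separate, which is the kind of effect that lets expert knowledge reshape the particle layout.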