Capturing Requirements for a Data Annotation Tool for Intensive Care: Experimental User-Centered Design Study

Marceli Wac, Raul Santos-Rodriguez, Chris McWilliams, Christopher Bourdeaux

TL;DR

This paper tackles the challenge of annotating ICU time-series data for machine learning by applying a user-centered design approach. Clinicians manually annotated printed ICU data, and their actions were analyzed through Norman's Interaction Cycle to identify how they approach labeling and where tool support is needed. The study derives a set of 11 requirements (five for single-admission annotation, three for semi-automated workflows, two operational constraints, and one ML-context requirement) to guide the development of a dedicated ICU data annotation platform. The results emphasize flexibility, continuous annotation–evaluation loops, asynchronous remote access, and robust metadata to support multi-annotator labeling, with the aim of improving annotation efficiency and data quality for downstream ML tasks. The work outlines practical steps for designing and trialing a domain-specific annotation tool in intensive care settings, with plans for broader validation and real-world testing.

Abstract

Intensive care units (ICUs) are complex and data-rich environments. Data routinely collected in ICUs provide tremendous opportunities for machine learning, but their use comes with significant challenges. Complex problems may require additional input from humans, which can be provided through a process of data annotation. Annotation is a complex, time-consuming process that requires domain expertise and technical proficiency. Existing data annotation tools fail to provide an effective solution to this problem. In this study, we investigated clinicians' approach to the annotation task. We focused on establishing the characteristics of the annotation process in the context of clinical data and identifying differences in the annotation workflow between staff roles. The overall goal was to elicit requirements for a software tool that could facilitate effective and time-efficient data annotation. We conducted an experiment in which ICU clinicians annotated printed sheets of data. The participants were observed during the task, and their actions were analysed in the context of Norman's Interaction Cycle to establish the requirements for the digital tool. The annotation process followed a constant loop of annotation and evaluation, during which participants incrementally analysed and annotated the data. No distinguishable differences were identified in how different staff roles annotate data. Preferred methods of applying annotations varied between participants and between admissions. We established 11 requirements for a digital data annotation tool for the healthcare setting. In summary, the manual data annotation activity allowed us to characterise the clinicians' approach to annotation and to elicit 11 key requirements for effective data annotation software.

Paper Structure (35 sections, 4 figures)

Figures (4)

  • Figure 1: Annotating data can provide additional information and context necessary to train machine learning models.
  • Figure 2: Participants were asked to annotate data on printed sheets resembling the interface of the existing clinical system at their ICU. The data were formatted as a table, with one parameter per row and one column per hour holding that parameter's aggregated value (the depicted table is trimmed for conciseness).
  • Figure 3: Norman's Interaction Cycle follows a 7-step approach to analysing the interaction of a user with technology (adapted from Norman, 2013).
  • Figure 4: Participants annotated data in a variety of ways using pen and paper during the workshop held in the ICU.