Improving Clinical Imaging Systems using Cognition based Approaches

Kailas Dayanandan, Brejesh Lall

TL;DR

The paper tackles the problem of safely integrating AI into clinical imaging by proposing a cognition-based approach that mirrors radiologists’ analytical workflows. It presents a think-along system that replicates the ABCDE regional analysis, leveraging deep learning to reveal context around affected areas through an effective receptive field, thereby supporting System 2 deliberation. Through qualitative clinician interviews and extrinsic datasets (VQA-RAD, MIMIC-CXR, VinDr-CXR), it identifies hard-to-diagnose diseases and demonstrates how context-aware machine diagnoses can reduce inattentional blindness and supervision burden. The findings suggest design guidelines for creating complementary AI that enhances diagnostic accuracy and efficiency in real-world settings, with practical implications for rural and resource-limited environments. Overall, the work advances human–AI collaboration in medical imaging by coupling cognitive insights with context-rich visual explanations to support clinician supervision rather than replace it.

Abstract

Clinical systems operate in safety-critical environments and are not intended to function autonomously; however, they are currently designed to replicate clinicians' diagnoses rather than assist them in the diagnostic process. To enable better supervision of system-generated diagnoses, we replicate radiologists' systematic approach used to analyze chest X-rays. This approach facilitates comprehensive analysis across all regions of clinical images and can reduce errors caused by inattentional blindness and under-reading. Our work addresses a critical research gap by identifying difficult-to-diagnose diseases for clinicians using insights from human vision, enabling these systems to serve as an effective "second pair of eyes". These improvements make the clinical imaging systems more complementary and combine the strengths of human and machine vision. Additionally, we leverage effective receptive fields in deep learning models to present machine-generated diagnoses with sufficient context, making it easier for clinicians to evaluate them.

Paper Structure

This paper contains 18 sections, 6 figures, 5 tables.

Figures (6)

  • Figure 1: (a) The Navon dataset contains images of a large letter rendered from small copies of another letter \cite{navon1977forest}, adapted with rotations at angles between -45 and 45 degrees \cite{hermann2020origins}. (b) Considerable differences exist depending on formation \cite{gerlach2018navon}. (c) An example from the Human Confusion Dataset illustrating the dual thinking framework, where the cover on the table is first wrongly inferred to be a cat \cite{dayanandan2024dual}. The dual thinking framework is being studied in perceptual \cite{dayanandan2024dual} and electrophysiological studies \cite{vanrullen2007power, grootswagers2019representational, kreiman2020beyond, thorpe1996speed, van2020going, tang2018recurrent}. (d) Human vision does not require complete information; the image shows this perceptual abstraction \cite{decarlo2002stylization}. (e) Human vision depends on shape, whereas deep learning models rely on texture, as shown in an example from the Stylized ImageNet (SIN) dataset \cite{geirhos2018imagenet}. (f) A focus on texture can help identify objects that are difficult for human vision, shown using an example from the Human Confusion Dataset \cite{dayanandan2024dual}. (g) Deep learning models also take other features into account, which can improve accuracy; although this affects generalization, using more features can be helpful in medical imaging \cite{beery2018recognition}. (h) Deep learning models are prone to errors that human vision is not prone to (an example from ImageNet-A) \cite{hendrycks2021natural}. (i) Deep learning models are prone to adversarial attacks, which are unlikely to be present in safety-critical clinical settings \cite{goodfellow2014explaining}.
  • Figure 2: Radiology workflow from the clinician to final patient care. In our user trials, we observe that the preview, report, and discussion steps do not happen in typical rural settings. The steps in red are essential, while those with an orange border may not be present.
  • Figure 3: The first row contains the original X-ray and the generated regions shown on the X-ray, while the second row shows the masks generated for the different regions in the ABCDE approach. We use a random X-ray from the VinDr-CXR dataset. (a) Original image (b) Airway, consisting of the trachea and mediastinal width (e) Breathing, showing the lung fields. This also identifies the relevant region to be shown for evaluating cardiomegaly in Fig. \ref{fig:abcde_process} (c) Left lobe (d) Right lobe (f) Circulation, showing the heart and related regions (g) Diaphragm
  • Figure 4: The ABCDE process ensures that doctors analyze each region in detail \cite{thim2012initial}. In our user trials, we show an example from the VinDr-CXR dataset with aortic enlargement and cardiomegaly in the circulation region. Initially, the circulation region, shown in yellow in the thumbnail view, is expanded to ensure a detailed analysis (System 2) of the affected areas. Both aortic enlargement and cardiomegaly require additional context for proper assessment (Fig. \ref{fig:cardio} and \ref{fig:zoom-eval}). Upon selecting these diseases, the view expands to include the thoracic region (Fig. \ref{fig:b1-abcde}), as observed in our analysis using deep learning methods, while excluding unrelated parts of the image as shown in the figure. (a) Original image (b) The regions corresponding to ABCDE cover the entire region and bring focus to all parts of the thoracic region (c) Circulation region (d) Circulation region expanded in the ABCDE workflow (e) Certain diseases are better evaluated by observing the entire thoracic region. For example, a cardiothoracic ratio $\geq 0.50$ indicates cardiomegaly, where the cardiothoracic ratio is $\frac{MRD+MLD}{ID}$. Image source: Wikipedia. (f) Anticipating diseases based on their characteristics can help in human-machine interaction. Cardiomegaly requires a larger context, whereas other diseases such as nodules need to be zoomed in for easier recognition. A computational model of human vision can help determine the additional context needed for verification. Screenshots are provided in the supplementary data.
  • Figure 5: Some examples of diseases that are likely to be missed. (a, b) Pneumothorax in the apex, from the VinDr-CXR dataset (c) Pneumothorax at the border, from the VinDr-CXR dataset (d) Pneumoperitoneum, the abnormal presence of air or other gas in the peritoneal cavity. Image source: Wikipedia. (e) A massive left pleural effusion displacing the heart and trachea to the right, a case of mediastinal shift (the mediastinum is the compartment of the thoracic cavity between the pleural sacs of the right and left lungs). Image source: Wikipedia.
  • ...and 1 more figure
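
The cardiothoracic ratio in the Figure 4(e) caption can be worked through with a short example. This is a minimal sketch, not the authors' implementation. It assumes MRD and MLD are the maximal right and left heart diameters measured from the midline and ID is the internal thoracic diameter (the caption does not expand these abbreviations). The function names and the measurements are hypothetical, chosen only for illustration.

```python
def cardiothoracic_ratio(mrd: float, mld: float, internal_diameter: float) -> float:
    """Return (MRD + MLD) / ID, the ratio given in the Figure 4(e) caption.

    All three lengths must be in the same unit (e.g. millimetres or pixels).
    The meaning of each argument is an assumption, see the lead-in.
    """
    if internal_diameter <= 0:
        raise ValueError("internal_diameter must be positive")
    if mrd < 0 or mld < 0:
        raise ValueError("heart diameters must be non-negative")
    return (mrd + mld) / internal_diameter


def indicates_cardiomegaly(ctr: float, threshold: float = 0.50) -> bool:
    """Apply the caption's rule: a ratio >= 0.50 indicates cardiomegaly."""
    return ctr >= threshold


# Hypothetical measurements in millimetres, not taken from any dataset.
enlarged = cardiothoracic_ratio(mrd=45.0, mld=95.0, internal_diameter=270.0)
typical = cardiothoracic_ratio(mrd=40.0, mld=80.0, internal_diameter=280.0)

print(f"enlarged: {enlarged:.3f} -> {indicates_cardiomegaly(enlarged)}")
print(f"typical:  {typical:.3f} -> {indicates_cardiomegaly(typical)}")
```

Because the denominator spans the whole thorax, a reviewer needs the full thoracic width in view to check the ratio, which is consistent with the caption's point that cardiomegaly calls for a larger context than a tight crop around the heart.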