GazeSearch: Radiology Findings Search Benchmark
Trong Thang Pham, Tien-Phat Nguyen, Yuki Ikebe, Akash Awasthi, Zhigang Deng, Carol C. Wu, Hien Nguyen, Ngan Le
TL;DR
This work tackles the misalignment between radiologists' gaze and radiology findings by introducing GazeSearch, a dataset that converts free-view eye-tracking data into finding-aware visual search sequences for chest X-rays. It then proposes ChestSearch, a transformer-based scanpath predictor that is pretrained with self-supervised radiology features and guided by a query mechanism to predict subsequent fixations, their durations, and when the search terminates. The authors report that GazeSearch enables meaningful modeling of medical visual search and that ChestSearch achieves state-of-the-art alignment with radiologist gaze across multiple metrics, establishing a benchmark for future medical visual search research. Overall, the approach aims to improve interpretability and trust in AI-assisted radiology by aligning model attention with expert human gaze.
Abstract
Medical eye-tracking data is an important source of information for understanding how radiologists visually interpret medical images. This information improves not only the accuracy of deep learning models for X-ray analysis but also their interpretability, enhancing transparency in decision-making. However, current eye-tracking data is dispersed, unprocessed, and ambiguous, making it difficult to derive meaningful insights. There is therefore a need for a new dataset with more focused and purposeful eye-tracking data, improving its utility for diagnostic applications. In this work, we propose a refinement method inspired by the target-present visual search challenge: there is a specific finding, and fixations are guided to locate it. After refining existing eye-tracking datasets, we transform them into a curated visual search dataset for radiology findings, called GazeSearch, in which each fixation sequence is purposefully aligned with the task of locating a particular finding. Subsequently, we introduce a scanpath prediction baseline, called ChestSearch, specifically tailored to GazeSearch. Finally, we employ GazeSearch as a benchmark to evaluate the performance of current state-of-the-art methods, offering a comprehensive assessment of visual search in the medical imaging domain. Code is available at https://github.com/UARK-AICV/GazeSearch.
