Self-Supervised Ultrasound Screen Detection
Alberto Gomez, Jorge Oliveira, Ramon Casero, Agis Chartsias
TL;DR
The paper tackles the bottleneck of accessing ultrasound (US) data by proposing a self-supervised pipeline that detects the US screen in a photograph of the monitor, corrects the perspective, and reconstructs usable frames for downstream analysis. It introduces synthetic, self-annotated data generation and a multi-task CNN that localizes the screen corners and predicts whether a screen is present, coupled with homography-based rectification and basic post-processing. The approach is evaluated on synthetic and real data, showing strong screen-detection performance and reasonable image reconstruction, though performance on real data declines because of reflections and labeling ambiguities. Overall, the method enables rapid testing and prototyping of US image analysis without hardware changes, potentially accelerating the development of real-time imaging workflows.
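To make the homography-based rectification step concrete, the following is a minimal sketch of how four detected screen corners could be mapped to an upright rectangle. It uses a plain direct linear transform (DLT) in NumPy; the function names, corner coordinates, and the 640x480 output size are assumptions chosen for illustration and are not taken from the paper, whose actual implementation may differ.

```python
import numpy as np


def estimate_homography(src, dst):
    """Estimate a 3x3 homography from point correspondences using the DLT.

    src, dst: sequences of (x, y) points, at least four of each.
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear constraints on H.
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    A = np.array(rows, dtype=float)
    # The solution is the right singular vector with the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]


def apply_homography(H, pts):
    """Map (x, y) points through H using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


# Hypothetical corner predictions (top-left, top-right, bottom-right,
# bottom-left) in photograph pixel coordinates.
corners = [(120, 80), (520, 110), (500, 400), (100, 360)]
# Assumed output frame: a 640x480 upright rectangle.
target = [(0, 0), (639, 0), (639, 479), (0, 479)]

H = estimate_homography(corners, target)
rectified_corners = apply_homography(H, corners)
```

In practice the estimated matrix would be passed to an image-warping routine (for example OpenCV's `cv2.warpPerspective`) to resample the photograph into the rectified frame; the sketch above only covers the geometric estimation.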
Abstract
Ultrasound (US) machines display images on a built-in monitor, but routine transfer to hospital systems relies on DICOM. We propose a self-supervised pipeline to extract the US image from a photograph of the monitor. This removes the DICOM bottleneck and enables rapid testing and prototyping of new algorithms. In a proof-of-concept study, the rectified images retained enough visual fidelity to classify cardiac views with a balanced accuracy of 0.79 with respect to the native DICOMs.
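For readers unfamiliar with the metric, balanced accuracy is the mean of the per-class recalls, which keeps frequent classes from dominating the score. The sketch below assumes that the reference labels come from the native DICOM images and the predictions from the rectified photographs; the view names and label lists are made-up placeholders, not data from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall over the classes present in y_true."""
    classes = sorted(set(y_true))
    recalls = []
    for c in classes:
        # Indices of all samples whose reference label is class c.
        idx = [i for i, t in enumerate(y_true) if t == c]
        correct = sum(1 for i in idx if y_pred[i] == c)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical cardiac-view labels, for illustration only.
reference = ["A4C", "A4C", "A4C", "A2C", "PLAX", "PLAX"]
predicted = ["A4C", "A4C", "A2C", "A2C", "PLAX", "A4C"]

score = balanced_accuracy(reference, predicted)
```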
