Learning semantic image quality for fetal ultrasound from noisy ranking annotation
Manxi Lin, Jakob Ambsdorf, Emilie Pi Fogtmann Sejer, Zahra Bashir, Chun Kit Wong, Paraskevas Pegios, Alberto Raheli, Morten Bo Søndergaard Svendsen, Mads Nielsen, Martin Grønnebæk Tolsgaard, Anders Nymark Christensen, Aasa Feragen
TL;DR
The paper addresses semantic image quality in fetal ultrasound by reframing image quality assessment as a ranking problem and introducing ORBNet, a coarse-to-fine ordinal-regression network with uncertainty estimation. ORBNet combines a global ordinal ranking over $m$ bins with a local offset to produce a final score $s_i \in [0,1]$, and is trained with a coarse loss $l_{coarse}$ and a fine pairwise loss $l_{fine}$ that depend on the pairwise relations $p_{ij}$. To accommodate noisy annotations, it uses MC Dropout to quantify ranking uncertainty, and a merge-sort-based annotation scheme makes it efficient to obtain a full ranking of the dataset. Empirical results on a third-trimester fetal ultrasound dataset show that ORBNet outperforms several baselines on key ranking metrics and performs consistently across cross-validation folds, which points to the method's potential for clinical navigation and automated acquisition tasks. The work also discusses limitations related to annotation noise and sources of uncertainty, and suggests refining the uncertainty modeling and annotation strategies in future work.
Abstract
We introduce the notion of semantic image quality for applications where image quality relies on semantic requirements. Working in fetal ultrasound, where ranking is challenging and annotations are noisy, we design a robust coarse-to-fine model that ranks images based on their semantic image quality and endow our predicted rankings with an uncertainty estimate. To annotate rankings on training data, we design an efficient ranking annotation scheme based on the merge sort algorithm. Finally, we compare our ranking algorithm to a number of state-of-the-art ranking algorithms on a challenging fetal ultrasound quality assessment task, showing the superior performance of our method on the majority of rank correlation metrics.
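To illustrate why a merge-sort-based scheme keeps annotation effort low, the following is a minimal sketch of ranking by pairwise comparison. It is not the authors' annotation tool: the function name `merge_sort_rank`, the `is_better` callback, and the simulated annotator with its hidden quality values are all hypothetical, introduced only for illustration. In practice the callback would present two images to a clinician and record which one has higher semantic quality.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def merge_sort_rank(items: Sequence[T], is_better: Callable[[T, T], bool]) -> List[T]:
    """Order items from lowest to highest quality using pairwise judgments.

    is_better(a, b) should return True when a is judged to have higher
    quality than b. Each call stands for one annotator comparison.
    """
    items = list(items)
    if len(items) <= 1:
        return items

    # Rank each half independently, then merge the two ranked halves.
    mid = len(items) // 2
    left = merge_sort_rank(items[:mid], is_better)
    right = merge_sort_rank(items[mid:], is_better)

    merged: List[T] = []
    i = j = 0
    while i < len(left) and j < len(right):
        if is_better(left[i], right[j]):
            # The left item is better, so the right item ranks lower.
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1

    # Whatever remains is already ranked and needs no further comparisons.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


# Hypothetical, noise-free annotator simulated from made-up quality values.
hidden_quality = {"img_a": 0.9, "img_b": 0.2, "img_c": 0.5, "img_d": 0.7, "img_e": 0.1}
calls: List[Tuple[str, str]] = []


def simulated_annotator(a: str, b: str) -> bool:
    calls.append((a, b))  # log every comparison the annotator is asked to make
    return hidden_quality[a] > hidden_quality[b]


ranking = merge_sort_rank(list(hidden_quality), simulated_annotator)
print(ranking)
print(f"{len(calls)} comparisons for {len(hidden_quality)} images")
```

Merge sort needs on the order of $n \log n$ comparisons, whereas comparing every pair needs $n(n-1)/2$, which is what makes a full dataset ranking feasible to annotate. The sketch assumes consistent answers; with noisy annotators the resulting ranking is only approximate, which is the situation the uncertainty estimate is meant to address.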
