MultiOrg: A Multi-rater Organoid-detection Dataset
Christina Bukas, Harshavardhan Subramanian, Fenja See, Carina Steinchen, Ivan Ezhov, Gowtham Boosarpu, Sara Asgharpour, Gerald Burgstaller, Mareike Lehmann, Florian Kofler, Marie Piraud
TL;DR
MultiOrg is a large, openly available 2D organoid-detection dataset with multi-rater annotations, designed to quantify label uncertainty in biomedical imaging. It provides over 400 high-resolution images and more than 60,000 bounding boxes; the test data carry three label sets, annotated by two experts at two time points, to support the study of annotation noise. A benchmark of four detectors (Faster R-CNN, SSD, YOLOv3, RTMDet) on COCO-formatted annotations illustrates the difficulty of the task and the trade-offs between models, with SSD achieving a strong mAP@0.5 and showing resilience to label noise. The authors also release a Napari plugin for interactive quantification and curation, along with Kaggle and Zenodo resources to support reproducibility and uncertainty research. Overall, MultiOrg contributes an open dataset at the intersection of microscopy and uncertainty quantification, enabling high-throughput organoid quantification and benchmarking under label variability.
Abstract
High-throughput image analysis in the biomedical domain has gained significant attention in recent years, driving advancements in drug discovery, disease prediction, and personalized medicine. Organoids, specifically, are an active area of research, providing excellent models for human organs and their functions. Automating the quantification of organoids in microscopy images would overcome substantial manual quantification bottlenecks, particularly in high-throughput settings. However, there is a notable lack of open biomedical datasets compared with other domains, such as autonomous driving, and only a few of the existing ones have attempted to quantify annotation uncertainty. In this work, we present MultiOrg, a comprehensive organoid dataset tailored for object detection tasks with uncertainty quantification. The dataset comprises over 400 high-resolution 2D microscopy images and curated annotations of more than 60,000 organoids. Most importantly, it includes three label sets for the test data, independently annotated by two experts at distinct time points. We additionally provide a benchmark for organoid detection and make the best model available through an easily installable, interactive plugin for the popular image visualization tool Napari, to perform organoid quantification.
