Multi-Modal Dataset Creation for Federated Learning with DICOM Structured Reports
Malte Tölle, Lukas Burger, Halvar Kelm, Florian André, Peter Bannas, Gerhard Diller, Norbert Frey, Philipp Garthe, Stefan Groß, Anja Hennemuth, Lars Kaderali, Nina Krüger, Andreas Leha, Simon Martin, Alexander Meyer, Eike Nagel, Stefan Orwat, Clemens Scherer, Moritz Seiffert, Jan Moritz Seliger, Stefan Simm, Tim Friede, Tim Seidler, Sandy Engelhardt
TL;DR
The paper tackles the difficulty of assembling large, heterogeneous, multi-modal medical datasets under privacy constraints by leveraging DICOM Structured Reports (SR) to link imaging data with diverse non-imaging information. It presents an open platform for data integration, matching, and cohort filtering that uses SRs and highdicom to harmonize data across eight German university hospitals, enabling concurrent filtering and preparation of datasets for federated learning (FL) on TAVI outcome prediction. Key contributions include four platform requirements, SR-based data representation across modalities, a flexible integration with or without Kaapana, and a demonstrated federated data export workflow that yields rich, multi-modal cohorts for FL. The work demonstrates the practical feasibility of building harmonized, cross-site datasets using SRs, highlighting the potential to accelerate privacy-preserving multi-modal clinical AI with open tooling.
Abstract
Purpose: Federated training is often hindered by heterogeneous datasets due to divergent data storage options, inconsistent naming schemes, varied annotation procedures, and disparities in label quality. This is particularly evident in emerging multi-modal learning paradigms, where dataset harmonization, including a uniform data representation and filtering options, is of paramount importance.
Methods: DICOM structured reports enable the standardized linkage of arbitrary information beyond the imaging domain and can be used within Python deep learning pipelines with highdicom. Building on this, we developed an open platform for data integration and interactive filtering that simplifies the process of assembling multi-modal datasets.
Results: In this study, we extend our prior work by showing that the platform is applicable to more, and more divergent, data types, and by streamlining datasets for federated training within an established consortium of eight university hospitals in Germany. We demonstrate its concurrent filtering ability by creating harmonized multi-modal datasets across all locations for predicting the outcome after minimally invasive heart valve replacement. The data includes DICOM data (i.e. computed tomography images, electrocardiography scans), annotations (i.e. calcification segmentations, pointsets, and pacemaker dependency), and metadata (i.e. prosthesis and diagnoses).
Conclusion: Structured reports bridge the traditional gap between imaging systems and information systems. By utilizing the inherent DICOM reference system, arbitrary data types can be queried concurrently to create meaningful cohorts for clinical studies. The graphical interface as well as example structured report templates will be made publicly available.
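The concurrent query described in the abstract can be illustrated with a minimal sketch. It is not the platform's implementation: it uses only the Python standard library, models each DICOM object or structured report as a plain record keyed by patient ID, and keeps the patients for whom every required data type is present and passes its filter. The names `Record`, `filter_cohort`, the `kind` labels, and the attribute keys are hypothetical and chosen for illustration only; in the actual platform, the content would come from DICOM structured reports read with highdicom.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List


@dataclass
class Record:
    """One data item linked to a patient (hypothetical stand-in for a DICOM object or SR)."""
    patient_id: str                      # corresponds to the DICOM PatientID
    kind: str                            # illustrative label, e.g. "CT", "ECG", "SR:pacemaker"
    attributes: Dict[str, object] = field(default_factory=dict)


Predicate = Callable[[Dict[str, object]], bool]


def filter_cohort(records: Iterable[Record], required: Dict[str, Predicate]) -> List[str]:
    """Return sorted patient IDs that have, for every required kind,
    at least one record whose attributes satisfy that kind's predicate."""
    records = list(records)
    per_kind = [
        {r.patient_id for r in records if r.kind == kind and predicate(r.attributes)}
        for kind, predicate in required.items()
    ]
    # A patient is in the cohort only if all filters match concurrently.
    return sorted(set.intersection(*per_kind)) if per_kind else []


def any_record(attrs: Dict[str, object]) -> bool:
    """Accept any record of the given kind."""
    return True


# Toy data (invented for illustration, not taken from the study).
records = [
    Record("P001", "CT"),
    Record("P001", "ECG"),
    Record("P001", "SR:pacemaker", {"dependent": True}),
    Record("P002", "CT"),
    Record("P002", "SR:pacemaker", {"dependent": False}),  # no ECG available
    Record("P003", "CT"),
    Record("P003", "ECG"),
    Record("P003", "SR:pacemaker", {"dependent": False}),
]

# Cohort: patients with CT, ECG, and a pacemaker-dependency label of either value.
cohort = filter_cohort(
    records,
    {
        "CT": any_record,
        "ECG": any_record,
        "SR:pacemaker": lambda attrs: "dependent" in attrs,
    },
)
print(cohort)
```

Intersecting per-type patient sets mirrors the idea that imaging data, annotations, and metadata are all tied together through shared DICOM identifiers, so a single query can constrain several data types at once.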
