A cautionary tale on the cost-effectiveness of collaborative AI in real-world medical applications
Francesco Cremonesi, Lucia Innocenti, Sebastien Ourselin, Vicky Goh, Michela Antonelli, Marco Lorenzi
TL;DR
This study addresses the practical evaluation of collaborative AI in healthcare by comparing six federated learning (FL) methods with five consensus-based learning (CBL) approaches across seven diverse medical datasets and tasks. It demonstrates that CBL can match FL in accuracy while delivering substantial cost reductions in training time (about 15x) and network bandwidth (about 60x), enabling more sustainable and accessible collaborations. Although neither paradigm consistently outperforms the other, CBL's asynchronous, modular nature reduces deployment complexity and hardware demands, suggesting a practical pathway for real-world adoption. The work also highlights privacy considerations and calls for quantitative system-level metrics, including energy and CO2 implications, to guide future deployments of collaborative AI in medicine.
Abstract
Background. Federated learning (FL) has gained wide popularity as a learning paradigm enabling collaborative AI in sensitive healthcare applications. Nevertheless, the practical implementation of FL presents technical and organizational challenges, as it generally requires complex communication infrastructures. In this context, consensus-based learning (CBL) may represent a promising collaborative learning alternative, thanks to its ability to combine local knowledge into a federated decision system while potentially reducing deployment overhead. Methods. In this work we propose an extensive benchmark of the accuracy and cost-effectiveness of a panel of FL and CBL methods in a wide range of collaborative medical data analysis scenarios. The benchmark includes 7 different medical datasets, encompassing 3 machine learning tasks, 8 different data modalities, and multi-centric settings involving 3 to 23 clients. Findings. Our results reveal that CBL is a cost-effective alternative to FL. When compared across the panel of medical datasets in the considered benchmark, CBL methods provide accuracy equivalent to that achieved by FL. At the same time, CBL significantly reduces training time and communication cost (15-fold and 60-fold decreases, respectively; p < 0.05). Interpretation. This study opens a novel perspective on the deployment of collaborative AI in real-world applications, where the adoption of cost-effective methods is instrumental to achieving sustainability and democratisation of AI by alleviating the need for extensive computational resources.
