Private, Efficient and Scalable Kernel Learning for Medical Image Analysis

Anika Hannemann, Arjhun Swaminathan, Ali Burak Ünal, Mete Akgün

TL;DR

This work tackles privacy constraints and high dimensionality in kernel learning for distributed medical images. It introduces OKRA, an orthonormal k-frame–based randomized encoding method that enables private, one-shot federated kernel computations for $K_{lin}$, $K_{RBF}$, $K_{poly}$, and $K_{RQ}$ without revealing raw data. The authors provide a formal privacy analysis for semi-honest, non-colluding settings and demonstrate, across MRI and blood cell datasets, that OKRA achieves accuracy comparable to centralized methods while reducing computation and communication costs relative to prior randomized encoding approaches. These results suggest that OKRA can enable scalable, privacy-preserving multi-site kernel learning in clinical pipelines, reducing data-sharing barriers while maintaining performance advantages of kernel methods.
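
For reference, the four kernels named above are commonly written as follows. This is a sketch using the standard textbook forms; the hyperparameter symbols $\gamma$, $c$, $d$, $\alpha$, and $\ell$ are assumptions introduced here, and the paper's exact parameterization may differ.

$$
\begin{aligned}
K_{lin}(x, y) &= x^\top y \\
K_{RBF}(x, y) &= \exp\left(-\gamma \lVert x - y \rVert^2\right) \\
K_{poly}(x, y) &= \left(\gamma\, x^\top y + c\right)^d \\
K_{RQ}(x, y) &= \left(1 + \frac{\lVert x - y \rVert^2}{2 \alpha \ell^2}\right)^{-\alpha}
\end{aligned}
$$

In these forms every kernel depends on $x$ and $y$ only through $x^\top y$ and $\lVert x - y \rVert^2 = x^\top x + y^\top y - 2\, x^\top y$, so inner products are sufficient to evaluate all four.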

Abstract

Medical imaging is key in modern medicine. From magnetic resonance imaging (MRI) to microscopic imaging for blood cell detection, diagnostic medical imaging reveals vital insights into patient health. To predict diseases or provide individualized therapies, machine learning techniques like kernel methods have been widely used. Nevertheless, implementing kernel methods poses multiple challenges. Medical image data often originates from various hospitals and cannot be combined due to privacy concerns, and the high dimensionality of image data presents another significant obstacle. While randomized encoding offers a promising direction, existing methods often struggle with a trade-off between accuracy and efficiency. Addressing the need for efficient privacy-preserving methods on distributed image data, we introduce OKRA (Orthonormal K-fRAmes), a novel randomized encoding-based approach for kernel-based machine learning. This technique, tailored for widely used kernel functions, significantly enhances scalability and speed compared to current state-of-the-art solutions. Through experiments conducted on various clinical image datasets, we evaluated model quality, computational performance, and resource overhead. Additionally, our method outperforms comparable approaches

Paper Structure

This paper contains 24 sections, 3 theorems, 9 equations, 3 figures, 1 table.

Key Result

Theorem

Given $A'$, $B'$, and $C'$, the central server can correctly compute the linear, Gaussian (RBF), polynomial, and rational quadratic kernels for the distributed input data.
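
The summary does not spell out how $A'$, $B'$, and $C'$ are constructed, so the following is not the OKRA protocol. It is only a minimal sketch of two standard facts that make a result of this kind plausible: mapping vectors through an orthonormal k-frame (k orthonormal vectors in a higher-dimensional space) preserves inner products, and all four kernels can be evaluated from inner products alone. The function names, the hyperparameters (`gamma`, `coef0`, `degree`, `alpha`, `length`), the shared frame, and the toy data are all assumptions made for illustration, and the sketch makes no privacy claim.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def random_orthonormal_frame(n, k, rng):
    """Return k orthonormal vectors in R^n (Gram-Schmidt on Gaussian draws)."""
    if k > n:
        raise ValueError("need k <= n")
    frame = []
    while len(frame) < k:
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        for q in frame:  # remove components along earlier frame vectors
            proj = dot(v, q)
            v = [a - proj * b for a, b in zip(v, q)]
        norm = math.sqrt(dot(v, v))
        if norm > 1e-8:  # skip (very unlikely) degenerate draws
            frame.append([a / norm for a in v])
    return frame


def encode(x, frame):
    """Map x in R^k to sum_j x[j] * frame[j] in R^n."""
    n = len(frame[0])
    return [sum(x[j] * frame[j][i] for j in range(len(frame))) for i in range(n)]


def gram(rows_a, rows_b):
    return [[dot(a, b) for b in rows_b] for a in rows_a]


def kernels_from_gram(g, sq_a, sq_b, gamma=0.5, coef0=1.0, degree=2,
                      alpha=1.0, length=1.0):
    """Build the four kernel matrices from inner products only.

    g[i][j] = <a_i, b_j>, sq_a[i] = <a_i, a_i>, sq_b[j] = <b_j, b_j>.
    Hyperparameter names and defaults are illustrative assumptions.
    """
    # Squared distances; clamp tiny negatives caused by rounding.
    d2 = [[max(0.0, sq_a[i] + sq_b[j] - 2.0 * g[i][j])
           for j in range(len(sq_b))] for i in range(len(sq_a))]
    return {
        "lin": [row[:] for row in g],
        "rbf": [[math.exp(-gamma * v) for v in row] for row in d2],
        "poly": [[(gamma * v + coef0) ** degree for v in row] for row in g],
        "rq": [[(1.0 + v / (2.0 * alpha * length ** 2)) ** (-alpha) for v in row]
               for row in d2],
    }


def max_abs_diff(m1, m2):
    return max(abs(a - b) for r1, r2 in zip(m1, m2) for a, b in zip(r1, r2))


rng = random.Random(0)
# Toy stand-ins for two sites' feature vectors (hypothetical data).
site_a = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
site_b = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(3)]

frame = random_orthonormal_frame(n=6, k=4, rng=rng)  # assumed shared frame
enc_a = [encode(x, frame) for x in site_a]
enc_b = [encode(x, frame) for x in site_b]

# Raw Gram matrix is computed here only to check the encoded one against it.
raw_gram = gram(site_a, site_b)
enc_gram = gram(enc_a, enc_b)
print("max |raw - encoded| inner product:", max_abs_diff(raw_gram, enc_gram))

kernels = kernels_from_gram(enc_gram,
                            [dot(e, e) for e in enc_a],
                            [dot(e, e) for e in enc_b])
```

The printed difference should be at the level of floating-point rounding, which is the point of the sketch: kernel matrices built from the encoded vectors match those built from the raw vectors, even though the code computing `kernels` never touches `site_a` or `site_b` directly.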

Figures (3)

  • Figure 1: Overview of OKRA.
  • Figure 2: Runtime comparison with increasing participants.
  • Figure 3: Encoding times against image sizes.

Theorems & Definitions (7)

  • Definition
  • Theorem
  • Proof
  • Theorem
  • Proof
  • Theorem
  • Proof