In-Model Merging for Enhancing the Robustness of Medical Imaging Classification Models

Hu Wang, Ibrahim Almakky, Congbo Ma, Numan Saeed, Mohammad Yaqub

TL;DR

This work tackles robustness in medical imaging classification by introducing InMerge, a single-model, finetuning-based kernel-merging strategy that reduces intra-model kernel redundancy in deeper CNN layers. Kernel similarity is quantified with cosine similarity, $\text{sim}(\mathbf{k}_i,\mathbf{k}_j) = \frac{\mathbf{k}_i^\top \mathbf{k}_j}{\|\mathbf{k}_i\| \|\mathbf{k}_j\|}$, and merging is performed via the interpolation $\mathbf{K}_i \leftarrow \alpha \mathbf{K}_i + (1-\alpha) \mathbf{K}_j$ when $\text{sim}(\mathbf{k}_i,\mathbf{k}_j) > \tau$. Merging is applied stochastically with probability $p$ and skips the first $L_s$ shallow layers. The method adds no extra inference cost and shows improved AUROC/accuracy across ChestXRay14 and MedMNIST, with ablations revealing the importance of deep-layer merging and hyperparameter choices. The findings suggest a practical regularization mechanism for robust medical imaging models and point toward future extensions to transformer-based architectures.
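The merge rule summarized above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`cosine_similarity`, `inmerge_layer`, `inmerge_model`), the default values of `alpha`, `tau`, and `p`, the treatment of each output filter as one "kernel", the order in which pairs are visited, and the choice to update only $\mathbf{K}_i$ are all hypothetical, since the summary does not specify them.

```python
import numpy as np


def cosine_similarity(k_i, k_j, eps=1e-12):
    """Cosine similarity between two kernels, each flattened to a vector."""
    a, b = np.ravel(k_i), np.ravel(k_j)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def inmerge_layer(weight, alpha=0.5, tau=0.9, p=0.5, rng=None):
    """Merge similar kernels inside one conv layer.

    weight: array of shape (out_channels, in_channels, kh, kw).
    Assumption: one "kernel" is one output filter, weight[i].
    Returns a merged copy of the weights and the list of merged (i, j) pairs.
    """
    rng = np.random.default_rng() if rng is None else rng
    merged = weight.copy()
    pairs = []
    n = weight.shape[0]
    for i in range(n):
        for j in range(i + 1, n):
            # Similarity is measured on the original (pre-merge) weights.
            similar = cosine_similarity(weight[i], weight[j]) > tau
            # Merge only with probability p (stochastic merging).
            if similar and rng.random() < p:
                # K_i <- alpha * K_i + (1 - alpha) * K_j; K_j is left unchanged here.
                merged[i] = alpha * merged[i] + (1.0 - alpha) * weight[j]
                pairs.append((i, j))
    return merged, pairs


def inmerge_model(layer_weights, num_shallow=1, **kwargs):
    """Apply inmerge_layer to every layer except the first num_shallow (L_s) layers."""
    return [
        w if idx < num_shallow else inmerge_layer(w, **kwargs)[0]
        for idx, w in enumerate(layer_weights)
    ]


# Toy layer with three 2x3x3 filters: filter 1 is a scaled copy of filter 0,
# filter 2 is orthogonal to both.
demo = np.stack([
    np.ones((2, 3, 3)),
    2.0 * np.ones((2, 3, 3)),
    np.concatenate([np.ones((1, 3, 3)), -np.ones((1, 3, 3))]),
])
_, demo_pairs = inmerge_layer(demo, alpha=0.5, tau=0.9, p=1.0)
print(demo_pairs)  # [(0, 1)]
```

In this sketch the merge touches only the stored weights, which is consistent with the claim that inference cost is unchanged: the layer keeps the same number of filters and the same shape.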

Abstract

Model merging is an effective strategy for combining multiple models to improve performance, and it is more efficient than ensemble learning because it adds no extra computation at inference. However, limited research has explored whether merging can occur within a single model to enhance its robustness, which is particularly critical in the medical imaging domain. In this paper, we are the first to propose in-model merging (InMerge), a novel approach that enhances a model's robustness by selectively merging similar convolutional kernels in the deep layers of a single convolutional neural network (CNN) during training for classification. We also analytically reveal important characteristics that affect how in-model merging should be performed, serving as a reference for the community. We demonstrate the feasibility and effectiveness of this technique for different CNN architectures on 4 prevalent datasets. The proposed InMerge-trained model surpasses the typically-trained model by a substantial margin. The code will be made public.

Paper Structure

This paper contains 18 sections, 6 equations, 3 figures, 2 tables, 1 algorithm.

Figures (3)

  • Figure 1: An example showing how similarity differs between the patterns stored in kernels of a well-trained ResNet34 layer. A more similar pair of kernels yields a higher similarity score. Kernel 9 and Kernel 10 are similar, with $\text{sim}(\mathbf{k}_9, \mathbf{k}_{10})=0.9372$; Kernel 3 and Kernel 11 have nearly orthogonal textures, so $\text{sim}(\mathbf{k}_3, \mathbf{k}_{11})=-0.0234$; Kernel 4 and Kernel 12 have very different textures and colors, with $\text{sim}(\mathbf{k}_4, \mathbf{k}_{12})=-0.5049$; Kernel 5 and Kernel 13 are somewhat similar, with $\text{sim}(\mathbf{k}_5, \mathbf{k}_{13})=0.4562$.
  • Figure 2: Merging with different merge weights and merge probabilities
  • Figure 3: (a) The sensitivity of different merged layers; (b) Merging with different similarities.
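
To help read the similarity values quoted in the Figure 1 caption, the short sketch below computes cosine similarity for three hand-built kernel pairs. The kernels (`vertical_edge`, `horizontal_edge`) are hypothetical toy examples chosen for illustration, not kernels from the paper's ResNet34; they only show which regimes correspond to values near 1, near 0, and below 0.

```python
import numpy as np


def cosine_similarity(k_i, k_j):
    """Cosine similarity between two kernels, each flattened to a vector."""
    a, b = np.ravel(k_i), np.ravel(k_j)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical 3x3 kernels (not taken from the paper).
vertical_edge = np.array([[-1.0, 0.0, 1.0]] * 3)   # responds to left-to-right intensity change
horizontal_edge = vertical_edge.T                   # responds to top-to-bottom intensity change

# Same pattern at a different magnitude: similarity close to 1.
sim_scaled = cosine_similarity(vertical_edge, 3.0 * vertical_edge)

# Unrelated (orthogonal) patterns: similarity close to 0.
sim_orthogonal = cosine_similarity(vertical_edge, horizontal_edge)

# Sign-flipped pattern: similarity close to -1.
sim_flipped = cosine_similarity(vertical_edge, -vertical_edge)

print(round(sim_scaled, 4), round(sim_orthogonal, 4), round(sim_flipped, 4))
```

Because cosine similarity ignores magnitude, a threshold on it selects kernels that encode nearly the same pattern regardless of scale, which is the kind of redundancy the caption's Kernel 9 and Kernel 10 pair illustrates.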