Combiner and HyperCombiner Networks: Rules to Combine Multimodality MR Images for Prostate Cancer Localisation

Wen Yan, Bernard Chiu, Ziyi Shen, Qianye Yang, Tom Syer, Zhe Min, Shonit Punwani, Mark Emberton, David Atkinson, Dean C. Barratt, Yipeng Hu

TL;DR

The paper addresses localising clinically significant prostate cancer from multiparametric MRI by modelling radiologist-style modality fusion rules. It introduces Combiner and HyperCombiner networks that use linear ($Z=\sum_{\tau} \alpha_{\tau} Y^{\tau}$) and nonlinear stacking ($Z=\sigma(\sum_{\tau} \beta_{\tau}Y^{\tau}+\beta_{0})$) formulations to weight modality-specific predictions, and a hypernetwork $\tilde{\boldsymbol{\theta}}=h(\boldsymbol{\alpha};\boldsymbol{\phi})$ to enable inference-time rule conditioning. The approach facilitates rule discovery and interpretation, including PI-RADS-based encoding of conditions with zone-specific decisions and hyperparameter-guided grid searches, while maintaining competitive segmentation performance on a sizeable clinical mpMR dataset (651 patients with all three modalities; 751 total cases; 500/124/127 train/val/test). HyperCombiner networks provide efficient exploration of alternative combining rules and quantify modality importance via odds ratios and statistical measures, offering practical insights for modality availability and decision rule optimization in clinical workflows. Overall, the work demonstrates that low-dimensional, interpretable rule models can match end-to-end performance while increasing transparency and enabling rule discovery in multimodal prostate cancer localisation.
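The two combining formulations can be sketched for a single voxel as follows. This is a minimal illustration rather than the authors' implementation: the function names, the example predictions and the weight values are all hypothetical, and treating each $Y^{\tau}$ as a per-voxel probability is an assumption. Reading $\exp(\beta_{\tau})$ as an odds ratio is the standard logistic-regression interpretation, assumed here to be what the summary means by odds ratios.

```python
import math


def linear_mixture(y, alpha):
    """Z = sum_tau alpha_tau * Y^tau for one voxel."""
    if len(y) != len(alpha):
        raise ValueError("one weight per modality is required")
    return sum(a * p for a, p in zip(alpha, y))


def nonlinear_stacking(y, beta, beta0):
    """Z = sigmoid(sum_tau beta_tau * Y^tau + beta_0) for one voxel."""
    if len(y) != len(beta):
        raise ValueError("one coefficient per modality is required")
    s = sum(b * p for b, p in zip(beta, y)) + beta0
    return 1.0 / (1.0 + math.exp(-s))


def odds_ratios(beta):
    """exp(beta_tau): multiplicative change in the odds of Z per unit of Y^tau."""
    return [math.exp(b) for b in beta]


# Hypothetical predictions from three modality-specific networks at one voxel
# (values are illustrative only, not taken from the paper).
y = [0.9, 0.2, 0.4]
alpha = [0.5, 0.3, 0.2]   # assumed mixture weights summing to 1
beta = [2.0, 1.0, 0.5]    # assumed stacking coefficients
beta0 = -1.5              # assumed bias

z_lin = linear_mixture(y, alpha)
z_stack = nonlinear_stacking(y, beta, beta0)
ratios = odds_ratios(beta)
```

Under the linear mixture, a weight of zero removes a modality from the rule entirely, which is one way to probe what happens when that modality is unavailable.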

Abstract

One of the distinct characteristics in radiologists' reading of multiparametric prostate MR scans, using reporting systems such as PI-RADS v2.1, is to score individual types of MR modalities, T2-weighted, diffusion-weighted, and dynamic contrast-enhanced, and then combine these image-modality-specific scores using standardised decision rules to predict the likelihood of clinically significant cancer. This work aims to demonstrate that it is feasible for low-dimensional parametric models to model such decision rules in the proposed Combiner networks, without compromising the accuracy of predicting radiologic labels: First, it is shown that either a linear mixture model or a nonlinear stacking model is sufficient to model PI-RADS decision rules for localising prostate cancer. Second, parameters of these (generalised) linear models are proposed as hyperparameters, to weigh multiple networks that independently represent individual image modalities in the Combiner network training, as opposed to end-to-end modality ensemble. A HyperCombiner network is developed to train a single image segmentation network that can be conditioned on these hyperparameters during inference, for much improved efficiency. Experimental results based on data from 850 patients, for the application of automating radiologist labelling multi-parametric MR, compare the proposed combiner networks with other commonly-adopted end-to-end networks. Using the added advantages of obtaining and interpreting the modality combining rules, in terms of the linear weights or odds-ratios on individual image modalities, three clinical applications are presented for prostate cancer segmentation, including modality availability assessment, importance quantification and rule discovery.

Paper Structure

This paper contains 49 sections, 21 equations, 6 figures, 7 tables, and 1 algorithm.

Figures (6)

  • Figure 1: Illustration of the PI-RADS v2 scoring system on a scale of 1–5 (left) and an example of the derived binary classification system used in this study (right).
  • Figure 2: Graphical models (a) to (c) use a single (shared) network with parameters $\theta$ to combine the inputs $\mathbf{x}^1$ and $\mathbf{x}^2$ and output $\mathbf{z}$, where $\mathbf{y}^1$, $\mathbf{y}^2$ and $\mathbf{y}$ are intermediate features and $\alpha$ is a set of combining parameters. (a) An example of the late combiner models of interest in this study, with observed $\alpha$ shown as a shaded circle; (b) a late-fusion combining method with learnable $\alpha$; (c) an early-fusion model for comparison. Hollow circles and solid dots indicate random variables and deterministic parameters, respectively. This is one example, among other possible probabilistic graphical representations, of the difference between the combiner models (a) and other model-combining methods (b and c). Panels (d) and (e) are schematic diagrams of the proposed Combiner and HyperCombiner, respectively, with three image modalities as example inputs.
  • Figure 3: Illustration of the proposed Combiner (left) and HyperCombiner (right) networks, which receive three types of images, T2W, DWI$_{hd}$ and ADC, as inputs. The hyperparameters $\boldsymbol{\alpha}$ are generated conditionally on certain combination rules (b) and are used as combination parameters to combine the outputs of the three images in training. The red parameters in Combiner and HyperCombiner are trainable weights, while all weights $\boldsymbol{\theta}$ in HyperCombiner are non-trainable and are generated by an auxiliary hypernetwork $\tilde{\boldsymbol{\theta}}=h(\boldsymbol{\alpha};\boldsymbol{\phi})$.
  • Figure 4: The first and second rows compare the DSC of Combiner and HyperCombiner, both with a UNet backbone, for the linear mixture and nonlinear stacking models, respectively. The $x$-axis is the index $No._{Rule}$ of the randomly selected hyperparameter combinations.
  • Figure 5: Heat map of grid-search results on the validation set for the linear mixture model, over combinations of hyperparameters ($\alpha_1, \alpha_2, \alpha_3$) subject to $\alpha_1+\alpha_2+\alpha_3=1$. Only $\alpha_1$ and $\alpha_2$ are plotted, since $\alpha_3 = 1-\alpha_1-\alpha_2$.
  • ...and 1 more figure
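
The constrained grid described in the Figure 5 caption, where only $\alpha_1$ and $\alpha_2$ vary freely and $\alpha_3$ follows from the sum-to-one constraint, can be enumerated as in the sketch below. The grid spacing, the function names and the scoring callback are hypothetical. In the paper's setting the score would be a validation metric obtained from a network conditioned on each $\boldsymbol{\alpha}$; that step is replaced here by a toy function.

```python
def simplex_grid(steps):
    """All (a1, a2, a3) with a_i >= 0 and a1 + a2 + a3 = 1, at spacing 1/steps."""
    grid = []
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            k = steps - i - j  # a3 is fixed once a1 and a2 are chosen
            grid.append((i / steps, j / steps, k / steps))
    return grid


def grid_search(score_fn, steps=10):
    """Return (best_alpha, best_score) for a caller-supplied score function."""
    scored = ((alpha, score_fn(alpha)) for alpha in simplex_grid(steps))
    return max(scored, key=lambda pair: pair[1])


def toy_score(alpha):
    """Stand-in for a validation metric; peaks at alpha = (0.5, 0.3, 0.2)."""
    a1, a2, _ = alpha
    return -((a1 - 0.5) ** 2) - ((a2 - 0.3) ** 2)


best_alpha, best_score = grid_search(toy_score, steps=10)
```

Enumerating integer steps and dividing afterwards keeps every grid point on the simplex without accumulating floating-point drift.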