
How and why does deep ensemble coupled with transfer learning increase performance in bipolar disorder and schizophrenia classification?

Sara Petiton, Antoine Grigis, Benoit Dufumier, Edouard Duchesnay

Abstract

Transfer learning (TL) and deep ensemble learning (DE) have recently been shown to outperform simple machine learning in classifying psychiatric disorders. However, there is still a lack of understanding as to why that is. This paper aims to understand how and why DE and TL reduce the variability of single-subject classification models in bipolar disorder (BD) and schizophrenia (SCZ). To this end, we investigated the training stability of TL and DE models. For the two classification tasks under consideration, we compared the results of multiple trainings with the same backbone but with different initializations. In this way, we account for the epistemic uncertainty arising from the uncertainty in the estimation of the model parameters. Prior work has shown that combining TL with DE significantly improves classifier performance. Building on these results, we investigate i) how many models are needed to benefit from the performance improvement of DE when classifying BD and SCZ from healthy controls, and ii) how TL induces better generalization, with and without DE. In the first case, we show that DE reaches a plateau when 10 models are included in the ensemble. In the second case, we find that pre-training constrains TL models sharing the same pre-trained weights to remain in the same basin of the loss function, which is not the case for deep learning models with randomly initialized weights (RI-DL).
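The deep ensemble strategy discussed in the abstract averages the predictions of several models trained with the same backbone but different random initializations. A minimal sketch of this averaging step, assuming probability-averaging over $T$ independently trained binary classifiers (the toy logistic models and names below are illustrative, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_proba(weights, X):
    """Toy per-model classifier: sigmoid of a linear score."""
    return 1.0 / (1.0 + np.exp(-X @ weights))

# T models trained from different random initializations
# (stand-ins here: random weight vectors).
T, n_features, n_subjects = 10, 5, 4
models = [rng.normal(size=n_features) for _ in range(T)]
X = rng.normal(size=(n_subjects, n_features))

# Deep ensemble: average each model's predicted probability per subject.
ensemble_proba = np.mean([predict_proba(w, X) for w in models], axis=0)
# ensemble_proba has one averaged probability per subject, shape (n_subjects,)
```

The averaged probabilities can then be thresholded (or fed to a ROC-AUC computation) exactly as a single model's outputs would be; the paper's observation is that the benefit of adding models plateaus around $T = 10$.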


Paper Structure

This paper contains 13 sections, 2 equations, 2 figures, 1 table.

Figures (2)

  • Figure 1: Learning curves obtained by monitoring the ROC-AUC performance of BD classification as a function of the number of models $T$ considered in the deep ensemble (DE) strategy. The obtained standard deviations are shown directly in the figure for each $T$-DE value examined on the x-axis. The "x=no-DE" configurations correspond to the means and standard deviations of the 90 trained models without DE.
  • Figure 2: Linear interpolation between RI-DL and TL models at the last and best training epochs on both BD and SCZ datasets. $\lambda\in [0,1]$ is the linear interpolation coefficient (see Eq. \ref{eq:LI}).
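The interpolation in Figure 2 follows the standard loss-landscape construction: given the parameter vectors $\theta_A$ and $\theta_B$ of two trained models, one evaluates the loss along $\theta(\lambda) = (1-\lambda)\,\theta_A + \lambda\,\theta_B$ for $\lambda \in [0, 1]$; a low-loss path indicates the two models sit in the same basin. A short sketch under that standard formulation (variable names are illustrative):

```python
import numpy as np

def interpolate(theta_a, theta_b, lam):
    """theta(lambda) = (1 - lambda) * theta_a + lambda * theta_b."""
    return (1.0 - lam) * theta_a + lam * theta_b

# Flattened parameter vectors of two trained models (toy values).
theta_a = np.array([0.0, 1.0])
theta_b = np.array([2.0, -1.0])

# Sample the straight path between the two models; evaluating the loss
# at each point traces the cross-section of the loss landscape.
lambdas = np.linspace(0.0, 1.0, 5)
path = [interpolate(theta_a, theta_b, lam) for lam in lambdas]
# lambda = 0 recovers model A; lambda = 1 recovers model B.
```

In practice each `theta` would be the flattened weights of a full network, and the loss would be recomputed at every interpolated point on a held-out set.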