Table of Contents
Fetching ...

Ensemble Learning with Sparse Hypercolumns

Julia Dietlmeier, Vayangi Ganepola, Oluwabukola G. Adegboro, Mayug Maniparambil, Claudia Mazo, Noel E. O'Connor

TL;DR

This work addresses the challenge of computational complexity of processing concatenated dense hypercolumns by applying stratified subsampling to the VGG16 based hypercolumns, and investigates the performance of ensemble learning on sparse hypercolumns.

Abstract

Directly inspired by findings in biological vision, high-dimensional hypercolumns are feature vectors built by concatenating multi-scale activations of convolutional neural networks for a single image pixel location. Together with powerful classifiers, they can be used for image segmentation i.e. pixel classification. However, in practice, there are only very few works dedicated to the use of hypercolumns. One reason is the computational complexity of processing concatenated dense hypercolumns that grows linearly with the size $N$ of the training set. In this work, we address this challenge by applying stratified subsampling to the VGG16 based hypercolumns. Furthermore, we investigate the performance of ensemble learning on sparse hypercolumns. Our experiments on a brain tumor dataset show that stacking and voting ensembles deliver competitive performance, but in the extreme low-shot case of $N \leq 20$, a simple Logistic Regression classifier is the most effective method. For 10% stratified subsampling rate, our best average Dice score is 0.66 for $N=20$. This is a statistically significant improvement of 24.53% over the standard multi-scale UNet baseline ($p$-value = $[3.07e-11]$, Wilcoxon signed-rank test), which is less effective due to overfitting.

Ensemble Learning with Sparse Hypercolumns

TL;DR

This work addresses the challenge of computational complexity of processing concatenated dense hypercolumns by applying stratified subsampling to the VGG16 based hypercolumns, and investigates the performance of ensemble learning on sparse hypercolumns.

Abstract

Directly inspired by findings in biological vision, high-dimensional hypercolumns are feature vectors built by concatenating multi-scale activations of convolutional neural networks for a single image pixel location. Together with powerful classifiers, they can be used for image segmentation i.e. pixel classification. However, in practice, there are only very few works dedicated to the use of hypercolumns. One reason is the computational complexity of processing concatenated dense hypercolumns that grows linearly with the size of the training set. In this work, we address this challenge by applying stratified subsampling to the VGG16 based hypercolumns. Furthermore, we investigate the performance of ensemble learning on sparse hypercolumns. Our experiments on a brain tumor dataset show that stacking and voting ensembles deliver competitive performance, but in the extreme low-shot case of , a simple Logistic Regression classifier is the most effective method. For 10% stratified subsampling rate, our best average Dice score is 0.66 for . This is a statistically significant improvement of 24.53% over the standard multi-scale UNet baseline (-value = , Wilcoxon signed-rank test), which is less effective due to overfitting.
Paper Structure (11 sections, 4 figures, 5 tables)

This paper contains 11 sections, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Our hypercolumn-based processing pipeline for binary pixel classification.
  • Figure 2: Dice score coefficient performance of LR versus UNet (left) and stacking versus voting (right). All models were trained on the same amount of images for 1% stratified subsampling rate. Graphs are showing mean (lines) $\pm$ standard deviation (filled areas).
  • Figure 3: Dice score coefficient performance of LR versus UNet (left) and stacking versus voting (right). All models were trained on the same amount of images for 10% stratified subsampling rate. Graphs are showing mean (lines) $\pm$ standard deviation (filled areas).
  • Figure 4: Selected qualitative results for $N=10$ training images. The first three rows show results for 1% subsampling rate and the last three rows show the results for 10% subsampling rate. Note that UNet is trained on $N=10$ images without subsampling.