Is network fragmentation a useful complexity measure?

Coenraad Mouton, Randle Rabe, Daniël G. Haasbroek, Marthinus W. Theunissen, Hermanus L. Potgieter, Marelie H. Davel

TL;DR

Using a fragmentation-based complexity measure, this work shows that fragmentation is a phenomenon worth investigating further when studying the generalization ability of deep neural networks, and reports on new observations related to fragmentation.

Abstract

It has been observed that the input space of deep neural network classifiers can exhibit 'fragmentation', where the model function rapidly changes class as the input space is traversed. The severity of this fragmentation tends to follow the double descent curve, achieving a maximum at the interpolation regime. We study this phenomenon in the context of image classification and ask whether fragmentation could be predictive of generalization performance. Using a fragmentation-based complexity measure, we show this to be possible by achieving good performance on the PGDL (Predicting Generalization in Deep Learning) benchmark. In addition, we report on new observations related to fragmentation, namely (i) fragmentation is not limited to the input space but occurs in the hidden representations as well, (ii) fragmentation follows the trends in the validation error throughout training, and (iii) fragmentation is not a direct result of increased weight norms. Together, these observations indicate that fragmentation is a phenomenon worth investigating further when studying the generalization ability of deep neural networks.
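To make the notion of a model function that "rapidly changes class as the input space is traversed" concrete, the following is a minimal sketch of one way such behaviour could be counted: sample points on a straight line between two inputs and count how often the predicted class changes. This is an illustration under stated assumptions, not necessarily the measure used in the paper. The function names `count_class_changes` and `toy_predict`, the linear path, and the number of sampled steps are all hypothetical choices made for this example.

```python
import numpy as np


def count_class_changes(predict, x_start, x_end, num_steps=100):
    """Count predicted-class changes along the straight line from x_start to x_end.

    `predict` is assumed to map a batch of inputs of shape (n, ...) to an
    array of n integer class labels. The linear path and `num_steps` are
    illustrative assumptions, not taken from the paper.
    """
    alphas = np.linspace(0.0, 1.0, num_steps)
    # Evenly spaced points on the segment between the two inputs.
    points = np.array([(1.0 - a) * x_start + a * x_end for a in alphas])
    labels = np.asarray(predict(points))
    # A change occurs wherever two consecutive points receive different labels.
    return int(np.sum(labels[1:] != labels[:-1]))


def toy_predict(points):
    """Hypothetical stand-in classifier: label is the parity of floor(x[0])."""
    return np.floor(points[:, 0]).astype(int) % 2


x_a = np.array([0.5, 0.0])
x_b = np.array([3.5, 0.0])
changes = count_class_changes(toy_predict, x_a, x_b, num_steps=301)
no_changes = count_class_changes(toy_predict, x_a, x_a, num_steps=10)
```

With a real classifier, `predict` would wrap the network's forward pass followed by an argmax over the class scores. Averaging the count over many input pairs would give a mean value comparable in spirit to the "mean fragmentation" reported in the figures, although the exact procedure here is an assumption.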

Paper Structure

This paper contains 16 sections, 2 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: Error versus model capacity for models trained on partially label-corrupted data (red) and clean data (green). Left: Test error. Right: Train error.
  • Figure 2: Mean fragmentation versus model capacity. Left: Input space. Right: Hidden space fragmentation for the first convolutional layer. 'Clean' and 'corrupt' refer to models trained on clean or partially label-corrupted data, respectively. Results are averaged over three seeds. The shaded regions indicate the error (standard deviation).
  • Figure 3: Mean fragmentation per convolutional layer versus model capacity. Channels per layer are given by the pattern $[k,\ 2k,\ 4k,\ 8k]$ where $k\in \{4,...,64\}$. First Row: Fragmentation in the hidden space representations for convolutional layers $1$ and $2$, respectively. Second Row: Fragmentation in the hidden space representations for convolutional layers $3$ and $4$, respectively. Results are averaged over three seeds. The shaded regions indicate the error (standard deviation).
  • Figure 4: Mean fragmentation as a function of depth. Left: Depth-wise fragmentation on clean CIFAR10. Right: Depth-wise fragmentation on corrupted CIFAR10. Results are averaged over three seeds. The shaded regions indicate the error (standard deviation). The legend indicates the number of channels in each layer.
  • Figure 5: Mean fragmentation throughout training. The dashed line marks the end of the first epoch. Results are averaged over three seeds. The shaded regions indicate the error (standard deviation).
  • ...and 1 more figures