K-Origins: Better Colour Quantification for Neural Networks

Lewis Mason, Mark Martinez

TL;DR

K-Origins improves semantic segmentation accuracy in two scenarios: object detection with low signal-to-noise ratios, and segmenting multiple objects that are identical in shape but vary in colour.

Abstract

K-Origins is a neural network layer designed to improve the performance of image-based networks when learning colour, or intensity, is beneficial. Over 250 encoder-decoder convolutional networks are trained and tested on 16-bit synthetic data, demonstrating that K-Origins improves semantic segmentation accuracy in two scenarios: object detection with low signal-to-noise ratios, and segmenting multiple objects that are identical in shape but vary in colour. K-Origins generates output features from the input features, $\textbf{X}$, by the equation $\textbf{Y}_k = \textbf{X}-\textbf{J}\cdot w_k$ for each trainable parameter $w_k$, where $\textbf{J}$ is a matrix of ones. Additionally, networks with varying receptive fields were trained to determine optimal network depths based on the dimensions of target classes, suggesting that receptive field lengths should exceed object sizes. By ensuring a sufficient receptive field length and incorporating K-Origins, we can achieve better semantic network performance.
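The equation $\textbf{Y}_k = \textbf{X}-\textbf{J}\cdot w_k$ can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `k_origins`, the array shapes, and the example values are assumptions made for illustration, and the weights are fixed here rather than trained. How the shifted features are processed downstream is not covered by the abstract and is left out.

```python
import numpy as np


def k_origins(x: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Return the stack Y_k = X - J * w_k for every origin w_k.

    x       : input features, e.g. an (H, W) intensity image (assumed shape)
    weights : the K origins w_k; trainable in the paper, fixed here
    returns : array of shape (K, *x.shape)
    """
    x = np.asarray(x, dtype=np.float64)
    w = np.asarray(weights, dtype=np.float64)
    # Broadcasting subtracts the scalar w_k from every element of x,
    # which is the same as subtracting the matrix J * w_k.
    return x[np.newaxis, ...] - w.reshape(-1, *([1] * x.ndim))


# Hypothetical 2x2 "image" and two hypothetical origins.
x = np.array([[100.0, 200.0],
              [300.0, 400.0]])
weights = np.array([150.0, 350.0])
y = k_origins(x, weights)
```

In this sketch, the sign of each entry of `y[k]` indicates whether the corresponding input value lies above or below the origin `w_k`, which is one way to read the layer as re-centring intensities around learned reference values.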

Paper Structure

This paper contains 15 sections, 5 equations, 13 figures, and 11 tables.

Figures (13)

  • Figure 1: Two examples of synthetic data segmentation tasks, demonstrating performance improvements from K-Origins. Each colour in the ground truth represents a different class. The neural networks used are discussed later, and differ only by the inclusion of K-Origins. (a) Segmenting multiple noisy grayscale classes from a noisy background after 20 epochs. (b) The "tracer" problem in colour, segmenting nearly identical classes (the largest circles with different grayscale values in the ground truth) with slight variations in colour distributions after 30 epochs.
  • Figure 2: (a) Mapping from 16-bit pixel integer values to grayscale intensity. (b) Example of synthetic data used in this paper with corresponding histograms for both noise-free and Gaussian noise cases.
  • Figure 3: Small encoder-decoder network predicting synthetic data with square side lengths ranging from 5 to 20 pixels. (a) Network architecture. (b) The network reaches 67% validation accuracy at steady state after 77 epochs with a learning rate of 1E-3. The panel shows the input data, its histogram, the ground truth, and the network prediction. The network struggles with colour magnitudes, correctly classifying only up to 4-5 pixels from the object borders, indicating reliance on colour gradients rather than colour magnitudes for classification.
  • Figure 4: (a) Very small colour network architecture and additional parameters. (b) Colour network results on the motivating case that the smaller encoder-decoder network failed to solve, the motivating case with additional classes, and a failure case with introduced noise, where the Hellinger distance is no longer unity between classes.
  • Figure 5: Set of neural networks used for receptive field length tests with and without K-Origins. Networks RFL8, RFL18, and RFL38 are miniature U-Net based architectures, differing primarily in their receptive field length. Networks KRFL8, KRFL18, and KRFL38 are identical but include K-Origins at every depth.
  • ...and 8 more figures
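
The caption of Figure 4 refers to the Hellinger distance between classes being unity or less than unity. As a hedged illustration of that quantity, the sketch below computes the standard Hellinger distance between two discrete histograms, $H(P, Q) = \sqrt{1 - \sum_i \sqrt{p_i q_i}}$. The function name `hellinger` and the bin counts are made up for this example; it assumes the caption refers to the intensity distributions of the classes, which this page does not state explicitly.

```python
import numpy as np


def hellinger(p, q) -> float:
    """Hellinger distance between two histograms defined over the same bins."""
    p = np.asarray(p, dtype=np.float64)
    q = np.asarray(q, dtype=np.float64)
    # Normalise raw counts so each histogram sums to one.
    p = p / p.sum()
    q = q / q.sum()
    # Bhattacharyya coefficient: overlap between the two distributions.
    bc = np.sum(np.sqrt(p * q))
    # Clamp at zero to guard against tiny negative values from rounding.
    return float(np.sqrt(max(0.0, 1.0 - bc)))


# Hypothetical intensity histograms (counts per bin).
disjoint = hellinger([5, 5, 0, 0], [0, 0, 5, 5])     # no shared bins
overlapping = hellinger([4, 4, 2, 0], [0, 2, 4, 4])  # noise spreads the classes
identical = hellinger([1, 1], [1, 1])
```

With these made-up counts, histograms that share no bins give a distance of exactly one, while any overlap, such as that introduced by noise, pulls the distance below one.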