Chanel-Orderer: A Channel-Ordering Predictor for Tri-Channel Natural Images

Shen Li, Lei Jiang, Wei Wang, Hongwei Hu, Liang Li

Abstract

This paper presents a proof of concept that, given a typical 3-channel image whose channels have been randomly permuted, a model (termed Chanel-Orderer) with ad-hoc inductive biases in both its architecture and its loss functions can accurately predict the channel ordering and restore the correct order. Specifically, Chanel-Orderer learns to score each of the three channels using priors on object semantics and uses the resulting scores to predict the channel ordering. This is useful in the common scenario where an \texttt{RGB} image is mistakenly displayed in the \texttt{BGR} format and needs to be corrected into the right order. Furthermore, as a byproduct, Chanel-Orderer is able to tell whether a given image is near-grayscale (near-monochromatic) or not (polychromatic). Our research suggests that Chanel-Orderer mimics the way humans visually color the physical natural world.

Paper Structure

This paper contains 24 sections, 1 theorem, 13 equations, 5 figures, 3 tables.

Key Result

Theorem 2.1

Suppose the function $g$ is a monotonically increasing differentiable function. Then the loss function $\mathcal{L}(s, y)$ is an increasing function of the score difference $\Delta_{ij}$ when $I_i \prec I_j$ and a decreasing function of $\Delta_{ij}$ when $I_i \succ I_j$.
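
The monotonicity condition can be sketched in derivative form. This assumes that $\Delta_{ij} := s_i - s_j$ is the difference between the scores of channels $I_i$ and $I_j$, and that $\mathcal{L}$ is differentiable in $\Delta_{ij}$; both are assumptions made here for illustration, not statements taken from the theorem:

$$
\frac{\partial \mathcal{L}}{\partial \Delta_{ij}} \ge 0 \quad \text{if } I_i \prec I_j,
\qquad
\frac{\partial \mathcal{L}}{\partial \Delta_{ij}} \le 0 \quad \text{if } I_i \succ I_j.
$$

Under this reading, minimizing $\mathcal{L}$ pushes $\Delta_{ij}$ down when $I_i \prec I_j$ and up when $I_i \succ I_j$, so the learned scores tend to agree with the ground-truth ordering of the channels.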

Figures (5)

  • Figure 1: We show a proof of concept that, given a typical 3-channel image whose channels have been permuted, our proposed model Chanel-Orderer, with ad-hoc inductive biases, can accurately predict the channel ordering. Note that a straightforward alternative is to cast the task as a classification problem over $3!=6$ categories (RGB, RBG, GRB, GBR, BRG and BGR) and to train a softmax classifier for prediction. However, softmax classifiers lack the necessary inductive biases and are inferior to the proposed Chanel-Orderer according to our empirical findings.
  • Figure 2: Architecture of the scoring function $f_\theta$. Given a tri-channel image $\mathcal{I}$, Chanel-Orderer first unpacks it into three channels, $I_1$, $I_2$ and $I_3$. These three channels are then sent separately and independently into a U-Net, which yields three feature maps $F_1$, $F_2$ and $F_3$. Each feature map $F_i$ is multiplied element-wise ($\otimes$) by the segmentation masks $M^1, ..., M^N$ and then mean-pooled, which yields the color representation $c_i^n$ of each semantic object, for $n=1,...,N$. We concatenate these into a vector $c_i := [c_i^1, ..., c_i^N]^T$. The general prior weight for each object is $\alpha:=[\alpha^1, ..., \alpha^N]^T$. The final score $s_i$ is then given by the inner product between $c_i$ and $\alpha$: $s_i = \alpha^T c_i$.
  • Figure 3: Examples of near-grayscale images. Near-grayscale images, which often appear in posters or advertisements, are mostly photographed for aesthetic purposes: photographers who make such images use polychromatic imagery to highlight the objects in the images and monochromatic imagery to render the rest.
  • Figure 4: Detection of near-grayscale images. (a) Results of Chanel-Orderer and the distribution of $\max_{i,j}{|\Delta_{ij}|}$. The threshold $\tau$ is set to $0.4$. (b) Results of the softmax model and the distribution of $H[p]$. The threshold is set to $1.79$.
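
The scoring pipeline of Figure 2 and the thresholding of Figure 4(a) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names are hypothetical, the raw channel stands in for the U-Net feature map $F_i$, the ascending-score sort used to read off an ordering is an assumed convention, and treating $\max_{i,j}|\Delta_{ij}| < \tau$ as "near-grayscale" is an assumed reading of the threshold with $\Delta_{ij} = s_i - s_j$.

```python
import numpy as np


def channel_score(feature_map, masks, alpha):
    """Score one channel as s_i = alpha^T c_i.

    feature_map: (H, W) array standing in for the U-Net output F_i.
    masks:       (N, H, W) array of segmentation masks M^1..M^N.
    alpha:       (N,) array of per-object prior weights.
    """
    # Element-wise multiplication with each mask, then mean pooling
    # over all pixels, gives one value c_i^n per semantic object.
    c = (masks * feature_map).mean(axis=(1, 2))
    return float(alpha @ c)


def predict_order(scores):
    """Return channel indices sorted by ascending score.

    Assumption: the canonical channel order corresponds to sorted scores;
    the sort direction here is a hypothetical convention.
    """
    return np.argsort(np.asarray(scores, dtype=float))


def max_score_gap(scores):
    """Compute max over i, j of |s_i - s_j|."""
    s = np.asarray(scores, dtype=float)
    return float(np.abs(s[:, None] - s[None, :]).max())


def is_near_grayscale(scores, tau=0.4):
    """Assumed rule: nearly equal channel scores indicate near-grayscale."""
    return max_score_gap(scores) < tau


# Toy example: two objects, each covering half of an 8x8 image.
H, W = 8, 8
masks = np.zeros((2, H, W))
masks[0, :, :4] = 1.0
masks[1, :, 4:] = 1.0
alpha = np.array([1.0, 2.0])

# Three constant channels stacked into an (H, W, 3) image.
image = np.stack([np.full((H, W), v) for v in (0.9, 0.1, 0.5)], axis=-1)
scores = [channel_score(image[..., i], masks, alpha) for i in range(3)]

order = predict_order(scores)
reordered = image[..., order]  # channels rearranged into the predicted order
```

In the actual model the feature maps come from a trained U-Net and $\alpha$ carries the object priors, so the toy values above only exercise the arithmetic of $s_i = \alpha^T c_i$ and the thresholding step.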

Theorems & Definitions (2)

  • Theorem 2.1
  • proof