Upright adjustment with graph convolutional networks

Raehyuk Jung, Sungmin Cho, Junseok Kwon

TL;DR

This work tackles upright adjustment of 360° images by processing data on its natural spherical domain. It introduces a CNN+GCN framework that maps CNN-derived feature maps to a spherical graph and uses a GCN to predict a discrete North-pole distribution, from which the upright rotation is obtained. A novel training objective combines distribution labels generated via von Mises-Fisher statistics with Jensen-Shannon divergence, yielding improved rotation invariance, faster convergence, and superior accuracy on SUN360 compared to projection-based CNNs and horizon-based methods. By operating directly on the sphere, the approach preserves the data's geometric structure and enhances VR viewing stability without relying on 2D projections.
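
The training objective described above can be illustrated with a minimal sketch, assuming the sphere is discretized into a fixed set of nodes. The Fibonacci-lattice sampling, the concentration value used in the usage note, and the helper names `fibonacci_sphere`, `vmf_label`, and `jsd` are illustrative assumptions, not details taken from the paper. The label is a von Mises-Fisher-shaped distribution normalized over the nodes, and the loss is the Jensen-Shannon divergence between two such discrete distributions.

```python
import math

def fibonacci_sphere(n):
    """Return n roughly uniform unit vectors (Fibonacci lattice; an assumed sampling)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n
        r = math.sqrt(max(0.0, 1.0 - z * z))
        points.append((r * math.cos(golden * i), r * math.sin(golden * i), z))
    return points

def vmf_label(points, mu, kappa):
    """Discrete vMF-shaped label: exp(kappa * <mu, x_i>) normalized over the nodes."""
    logits = [kappa * sum(m * p for m, p in zip(mu, pt)) for pt in points]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q); terms with p_i = 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

def jsd(p, q):
    """Jensen-Shannon divergence: symmetric and bounded above by log(2)."""
    m = [0.5 * (pi + qi) for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

For example, `jsd(vmf_label(pts, (0, 0, 1), 20.0), vmf_label(pts, (0, 0, -1), 20.0))` with `pts = fibonacci_sphere(200)` compares labels centered on opposite poles. Unlike the KL divergence, the value stays finite even when the two distributions barely overlap.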

Abstract

We present a novel method for the upright adjustment of 360° images. Our network consists of two modules: a convolutional neural network (CNN) and a graph convolutional network (GCN). The input 360° image is processed by the CNN for visual feature extraction, and the extracted feature map is converted into a graph that finds a spherical representation of the input. We also introduce a novel loss function to address the issue of discrete probability distributions defined on the surface of a sphere. Experimental results demonstrate that our method outperforms methods based on fully connected layers.
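
The conversion of a feature map into a spherical graph can be sketched as follows. This is a rough illustration under stated assumptions: it places each feature-map cell on the sphere with an equirectangular layout, which is an assumption and may differ from the paper's actual mapping, and the helper names `feature_map_to_sphere` and `knn_edges` are hypothetical. The 6-nearest-neighbor connectivity follows the Figure 2 caption.

```python
import math

def feature_map_to_sphere(height, width):
    """Map each feature-map cell (row, col) to a unit vector (assumed equirectangular layout)."""
    nodes = []
    for row in range(height):
        # Latitude runs from near +pi/2 (top row) to near -pi/2 (bottom row).
        lat = math.pi / 2 - math.pi * (row + 0.5) / height
        for col in range(width):
            lon = 2 * math.pi * (col + 0.5) / width - math.pi
            nodes.append((math.cos(lat) * math.cos(lon),
                          math.cos(lat) * math.sin(lon),
                          math.sin(lat)))
    return nodes

def knn_edges(nodes, k=6):
    """For each node, list the indices of its k nearest neighbors on the sphere."""
    neighbors = []
    for i, a in enumerate(nodes):
        # On the unit sphere, a larger dot product means a smaller angle.
        ranked = sorted(
            (j for j in range(len(nodes)) if j != i),
            key=lambda j: -sum(x * y for x, y in zip(a, nodes[j])),
        )
        neighbors.append(ranked[:k])
    return neighbors
```

The brute-force neighbor search is quadratic in the number of nodes, which is acceptable for a small feature map; a spatial index would be the usual choice for larger graphs.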

Paper Structure (16 sections, 3 equations, 5 figures, 2 tables)

This paper contains 16 sections, 3 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Upright adjustment consists of two steps. The first step is to estimate the North pole. Once the North pole is estimated, a rotation matrix $R$ that maps the estimated North pole to $(0,0,1)$ is applied to the input image by left-multiplication. The left and right images show the input and output, respectively.
  • Figure 2: Illustration of the network forwarding process. The orange box shows how the feature map is converted into the graph; the colors of the feature map and of the graph nodes indicate the correspondence between the feature map and the points (i.e., nodes) on the sphere. After being mapped to the sphere, each node is connected to its 6 nearest neighbors to form the graph. The red box illustrates the concept of the JSD loss.
  • Figure 3: Von Mises-Fisher distributions for different values of $\kappa$, where $\mu$ is the unit vector pointing toward the viewer.
  • Figure 4: Qualitative comparison. In the first row, the horizon-based method failed because no clearly visible horizon was detected. In the second row, both the horizon-based method and Jung et al. failed. In the third row, the horizon-based method succeeded because of a clear horizon, whereas Jung et al. failed. In contrast, our method succeeded in all cases and handled various scenarios (e.g., nature/urban, indoor/outdoor, and with or without a visible horizon).
  • Figure 5: Advantages of the GCN module.
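
The second step in the Figure 1 caption, finding a rotation matrix $R$ that maps the estimated North pole to $(0,0,1)$, can be sketched with Rodrigues' rotation formula. This is one possible construction, not necessarily the one used in the paper: the rotation is not unique, since any further rotation about the vertical axis also keeps $(0,0,1)$ fixed. The helper names `upright_rotation` and `apply` are hypothetical.

```python
import math

def upright_rotation(north):
    """Return a 3x3 rotation R (nested lists) such that R applied to north gives (0, 0, 1)."""
    norm = math.sqrt(sum(c * c for c in north))
    x, y, z = (c / norm for c in north)
    if z < -1.0 + 1e-12:
        # Antipodal case: the axis is ambiguous, so rotate by pi about the x-axis.
        return [[1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, -1.0]]
    # Axis v = north x (0, 0, 1) = (y, -x, 0); cosine of the angle is z.
    vx, vy = y, -x
    k = 1.0 / (1.0 + z)
    # Rodrigues' formula: R = I + [v]x + k * [v]x^2, written out entry by entry.
    return [
        [1.0 - k * vy * vy, k * vx * vy, vy],
        [k * vx * vy, 1.0 - k * vx * vx, -vx],
        [-vy, vx, 1.0 - k * (vx * vx + vy * vy)],
    ]

def apply(rotation, vector):
    """Left-multiply a 3-vector by a 3x3 matrix."""
    return [sum(r * v for r, v in zip(row, vector)) for row in rotation]
```

Applying the same matrix to the unit direction of every pixel and resampling the image at the rotated directions produces the upright output.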