Semi-Supervised 360 Layout Estimation with Panoramic Collaborative Perturbations

Junsong Zhang, Chunyu Lin, Zhijie Shen, Lang Nie, Kang Liao, Yao Zhao

TL;DR

SemiLayout360 tackles the challenge of limited annotations in 360-degree indoor layout estimation by embedding panoramic priors into perturbations within a Mean-Teacher semi-supervised framework. It introduces panoramic layout priors to sharpen boundaries and distortion priors to model non-uniform distortion, then couples them into panoramic collaborative perturbations that balance effectiveness with convergence stability. Building on DOPNet, the method uses image, feature, and network perturbations, along with a ramped consistency loss that leverages unlabeled data, achieving state-of-the-art results on PanoContext, Stanford2D3D, and MatterportLayout with limited labels. The approach demonstrates that task-specific priors in perturbations can substantially boost semi-supervised 360 layout estimation, reducing annotation costs while improving accuracy and robustness in both cuboid and non-cuboid room layouts.

Abstract

The performance of existing supervised layout estimation methods heavily relies on the quality of data annotations. However, obtaining large-scale and high-quality datasets remains a laborious and time-consuming challenge. To solve this problem, semi-supervised approaches are introduced to relieve the demand for expensive data annotations by encouraging the consistent results of unlabeled data with different perturbations. However, existing solutions merely employ vanilla perturbations, ignoring the characteristics of panoramic layout estimation. In contrast, we propose a novel semi-supervised method named SemiLayout360, which incorporates the priors of the panoramic layout and distortion through collaborative perturbations. Specifically, we leverage the panoramic layout prior to enhance the model's focus on potential layout boundaries. Meanwhile, we introduce the panoramic distortion prior to strengthen distortion awareness. Furthermore, to prevent intense perturbations from hindering model convergence and ensure the effectiveness of prior-based perturbations, we divide and reorganize them as panoramic collaborative perturbations. Our experimental results on three mainstream benchmarks demonstrate that the proposed method offers significant advantages over existing state-of-the-art (SoTA) solutions.

Paper Structure

This paper contains 26 sections, 7 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Brief comparison between the previous method and our method: (a) SSLayout360 [14], based on consistency regularization, applies vanilla perturbations (e.g., stretch, flip, rotate, gamma correction) at the image level. (b) We integrate panoramic layout and distortion priors into the perturbations and refine them into panoramic collaborative perturbations, which enables prior-based perturbations to complement each other, significantly improving the performance of semi-supervised panoramic layout estimation.
  • Figure 2: Overview of the SemiLayout360 framework. In the standard teacher-student framework, SemiLayout360 trains the student model $S$ (parameterized by $\theta^{S}$) on both labeled data $(x_{l}, y_{l})$ and unlabeled data $x_{u}$ by minimizing the corresponding supervised loss $L_{sup}$ and unsupervised consistency loss $L_{con}$. The teacher model (parameterized by $\theta^{T}$) is updated via the exponential moving average (EMA) of the student model's parameters and generates pseudo-labels $Z_{tea}$ for the unlabeled data. The core of SemiLayout360 is to apply multiple perturbations to the unlabeled samples, including image, feature, and network perturbations.
  • Figure 3: Qualitative results on the PanoContext dataset (top), Stanford2D3D dataset (middle), and MatterportLayout dataset (bottom). We compare our SemiLayout360 with the supervised DOPNet and SSLayout360. The supervised DOPNet is trained on 100 labels, while our SemiLayout360 and SSLayout360 use the same 100 labels along with unlabeled images. The boundaries of the room layout on a panorama are shown on the left and the floor plan is on the right. Ground truth is shown in green and the prediction in red. The predicted horizon depth, normal, and gradient are visualized below each panorama. We observe that SemiLayout360 predicts layout boundary lines that follow the ground truth more closely than those of DOPNet and SSLayout360, which demonstrates the effectiveness of applying customized image and feature perturbation strategies.
  • Figure 4: Qualitative comparison of individual perturbations on the PanoContext dataset. As image and feature perturbations are added collaboratively from left to right, the boundaries of the room layout become more accurate. Ground truth is shown in green and the prediction in red.
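
The teacher-student loop described in the Figure 2 caption can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names (`ema_update`, `sigmoid_rampup`, `total_loss`), the decay value, the sigmoid-shaped ramp, and the use of plain floats in place of network tensors are all assumptions borrowed from the common Mean-Teacher recipe. The source only states that the teacher is an EMA of the student and that the consistency loss is ramped.

```python
import math


def ema_update(teacher, student, decay=0.999):
    """Return new teacher parameters as an EMA of the student parameters.

    `teacher` and `student` are dicts mapping parameter names to floats
    (hypothetical stand-ins for network weights). `decay` is an assumed
    hyperparameter, not a value taken from the paper.
    """
    return {
        name: decay * teacher[name] + (1.0 - decay) * student[name]
        for name in teacher
    }


def sigmoid_rampup(step, rampup_steps):
    """Weight in (0, 1] that grows smoothly to 1.0 over `rampup_steps`.

    Uses exp(-5 * (1 - t)^2), a ramp shape commonly used with Mean Teacher;
    the paper's exact schedule is not given in this excerpt.
    """
    if rampup_steps <= 0:
        return 1.0
    clipped = min(max(step, 0), rampup_steps)
    phase = 1.0 - clipped / rampup_steps
    return math.exp(-5.0 * phase * phase)


def total_loss(loss_sup, loss_con, step, rampup_steps, max_weight=1.0):
    """Combine the supervised loss with the ramped consistency loss."""
    return loss_sup + max_weight * sigmoid_rampup(step, rampup_steps) * loss_con


if __name__ == "__main__":
    teacher = {"w": 1.0}
    student = {"w": 0.0}
    teacher = ema_update(teacher, student, decay=0.9)
    print(teacher)
    print(total_loss(loss_sup=1.0, loss_con=2.0, step=100, rampup_steps=100))
```

Under these assumptions, the consistency term contributes almost nothing at the start of training, when the teacher's pseudo-labels are unreliable, and reaches its full weight once `step` equals `rampup_steps`.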