TALoS: Enhancing Semantic Scene Completion via Test-time Adaptation on the Line of Sight

Hyun-Kurl Jang, Jihun Kim, Hyeokjun Kweon, Kuk-Jin Yoon

TL;DR

TALoS is a novel test-time adaptation approach for SSC that exploits the information available in driving environments. It aggregates reliable SSC predictions across multiple moments and uses them as semantic pseudo-GT for adaptation.

Abstract

Semantic Scene Completion (SSC) aims to perform geometric completion and semantic segmentation simultaneously. Despite the promising results achieved by existing studies, the inherently ill-posed nature of the task presents significant challenges in diverse driving scenarios. This paper introduces TALoS, a novel test-time adaptation approach for SSC that exploits the information available in driving environments. Specifically, we build on the fact that an observation made at one moment can serve as Ground Truth (GT) for scene completion at another moment. Given the characteristics of the LiDAR sensor, an observation of an object at a certain location confirms both 1) the occupation of that location and 2) the absence of obstacles along the line of sight from the LiDAR to that point. TALoS uses these observations to obtain self-supervision about occupancy and emptiness, guiding the model to adapt to the scene at test time. In a similar manner, we aggregate reliable SSC predictions across multiple moments and leverage them as semantic pseudo-GT for adaptation. Further, to leverage future observations that are not accessible at the current time, we present a dual optimization scheme using a model whose update is delayed until the future observation is available. Evaluations on the SemanticKITTI validation and test sets demonstrate that TALoS significantly improves the performance of the pre-trained SSC model. Our code is available at https://github.com/blue-531/TALoS.
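
The line-of-sight idea in the abstract can be illustrated with a small sketch. The code below is not taken from the TALoS repository: the function name `line_of_sight_labels`, the fixed-step ray sampling, and the `voxel_size` and `step` parameters are all assumptions chosen for illustration. It marks the voxel containing each LiDAR return as occupied, and marks the voxels crossed by the ray from the sensor to that return as empty.

```python
import math


def line_of_sight_labels(origin, points, voxel_size=1.0, step=0.25):
    """Derive occupancy / emptiness labels from LiDAR returns.

    origin: (x, y, z) sensor position.
    points: iterable of (x, y, z) LiDAR returns in the same frame.
    Returns (occupied, empty) as sets of integer voxel indices.

    Fixed-step ray sampling is an illustrative assumption; an exact
    voxel traversal could be substituted.
    """

    def voxel_of(p):
        # Map a continuous coordinate to its integer voxel index.
        return tuple(math.floor(c / voxel_size) for c in p)

    points = list(points)

    # 1) A return at a location confirms that the location is occupied.
    occupied = {voxel_of(p) for p in points}

    # 2) The ray from the sensor to the return confirms free space.
    empty = set()
    for p in points:
        direction = [pc - oc for pc, oc in zip(p, origin)]
        length = math.sqrt(sum(d * d for d in direction))
        if length == 0.0:
            continue  # Degenerate ray: the return coincides with the sensor.
        n_steps = int(length / step)
        for s in range(n_steps):
            t = (s * step) / length  # Fraction of the way along the ray.
            sample = tuple(oc + t * d for oc, d in zip(origin, direction))
            v = voxel_of(sample)
            if v not in occupied:  # Never overwrite a confirmed occupied voxel.
                empty.add(v)
    return occupied, empty


if __name__ == "__main__":
    occ, emp = line_of_sight_labels((0.5, 0.5, 0.5), [(3.5, 0.5, 0.5)])
    print("occupied:", sorted(occ))
    print("empty:", sorted(emp))
```

In this toy setting, labels computed from a scan at one moment would first be transformed into the coordinate frame of another moment before being compared with that moment's completion; the transformation step is omitted here.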

Paper Structure

This paper contains 30 sections, 9 equations, 7 figures, 14 tables, 1 algorithm.

Figures (7)

  • Figure 1: Left: Visualization of constructing a binary map $\mathbf{V}^{comp}_{j\rightarrow i}$ from the transformed point cloud $\mathbf{X}_{j\rightarrow i}$. Although we illustrate the process on a 2D grid for intuitive visualization, the actual process is performed on a 3D voxel grid. Right: A real example of the binary map projected onto 2D.
  • Figure 2: Left: Verification of the reliability metric. Voxels with higher reliability show higher semantic completion accuracy. Right: Examples of pseudo-GT (pGT) construction. The blue box depicts the successful rejection of a misprediction using reliability, while the red boxes show the benefit of using the prediction from another moment, which provides a more complete pGT.
  • Figure 3: Conceptual visualization of the dual optimization scheme. $\mathcal{F}^M$ is updated instantly at moment $i$, using the past information provided from the $j$th moment. In contrast, the update of $\mathcal{F}^G$ using the $i$th prediction is delayed until the $k$th moment, when the future information becomes available. We unify the predictions of the two models, $\mathbf{p}^M_i$ and $\mathbf{p}^G_i$, to get the final prediction $\mathbf{p}^{talos}_i$. The red dashed line denotes back-propagation.
  • Figure 4: Qualitative comparisons between the baseline (SCPNet) and ours (TALoS) on the SemanticKITTI val set. The highlighted regions depict the improvements achieved by TALoS, which completes the scene better while also correcting mispredictions.
  • Figure 5: Results of the playback experiment.
  • ...and 2 more figures