RGI-Net: 3D Room Geometry Inference from Room Impulse Responses With Hidden First-Order Reflections

Inmo Yeon, Jung-Woo Choi

TL;DR

RGI-Net learns and exploits complex relationships between low-order and high-order reflections in RIRs and, thus, can estimate room shapes even when the shape is non-convex or first-order reflections are missing in the RIRs.

Abstract

Room geometry is important prior information for implementing realistic 3D audio rendering. For this reason, various room geometry inference (RGI) methods have been developed that utilize the time-of-arrival (TOA) or time-difference-of-arrival (TDOA) information in room impulse responses (RIRs). However, conventional RGI techniques impose several assumptions, such as convex room shapes, a number of walls known a priori, and the visibility of first-order reflections. In this work, we introduce RGI-Net, which can estimate room geometries without these assumptions. RGI-Net learns and exploits complex relationships between low-order and high-order reflections in RIRs and, thus, can estimate room shapes even when the shape is non-convex or first-order reflections are missing in the RIRs. RGI-Net includes an evaluation network that separately evaluates the presence probability of walls, so geometry inference is possible without prior knowledge of the number of walls.
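To make the role of first-order reflections concrete, the following is a minimal sketch of how a single wall maps to a first-order TOA under the standard image-source model. It is not part of RGI-Net. The function names, the plane parametrization (unit normal and offset), the speed of sound, and the example coordinates are all assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def mirror_across_plane(point, normal, offset):
    """Mirror a 3D point across the plane {x : normal . x = offset}.

    `normal` is assumed to have unit length.
    """
    signed_dist = sum(n * p for n, p in zip(normal, point)) - offset
    return tuple(p - 2.0 * signed_dist * n for p, n in zip(point, normal))


def first_order_toa(source, receiver, normal, offset, c=SPEED_OF_SOUND):
    """TOA (seconds) of the first-order reflection from one wall.

    The reflected path length equals the distance from the image source
    (the source mirrored across the wall) to the receiver.
    """
    image = mirror_across_plane(source, normal, offset)
    return math.dist(image, receiver) / c


# Hypothetical setup: co-located source and receiver, wall at x = 4 m.
source = (1.0, 2.0, 1.5)
receiver = (1.0, 2.0, 1.5)
wall_normal = (1.0, 0.0, 0.0)
wall_offset = 4.0

toa = first_order_toa(source, receiver, wall_normal, wall_offset)
print(f"first-order TOA: {toa * 1e3:.2f} ms")
```

A TOA-based method inverts this mapping: it recovers the wall from the arrival time of the corresponding peak. If the reflection path is occluded, as can happen in a non-convex room, that peak is absent from the RIR and the wall has to be inferred from higher-order reflections instead.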

Paper Structure

This paper contains 9 sections, 2 equations, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Overview of the RGI-Net architecture. $M$, $N$, and $W_0$ denote the number of RIR channels, the temporal length of the RIRs, and the maximum number of walls, respectively.
  • Figure 2: Top view of rooms reconstructed from estimated wall parameters. Because the estimated planes can form four distinct L-shaped rooms, (c) and (d) were reconstructed with reference to the ground-truth (GT) room shapes. The black dot (left) denotes the position of the audio device. The black dashed lines and blue solid lines (right) correspond to walls reconstructed from the GT and inferred wall parameters, respectively.
  • Figure 3: Activation maps of multichannel RIRs displaying the use of high-order reflections for geometry inference. (a) convex pentagonal room and (b) non-convex L-NLOS room.
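Using the notation of Figure 1, the way per-wall presence probabilities remove the need to know the wall count in advance can be sketched as follows. This is an illustrative sketch only: the function name, the 0.5 threshold, the wall parametrization (unit normal plus distance), and all numeric values are assumptions, not details taken from the paper.

```python
def select_walls(wall_params, presence_probs, threshold=0.5):
    """Keep the candidate walls whose presence probability passes a threshold.

    wall_params    : list of W_0 candidate walls, each a tuple
                     (nx, ny, nz, d) (assumed plane parametrization)
    presence_probs : list of W_0 probabilities in [0, 1]
    threshold      : assumed decision threshold
    """
    if len(wall_params) != len(presence_probs):
        raise ValueError("expected one presence probability per candidate wall")
    return [w for w, p in zip(wall_params, presence_probs) if p >= threshold]


# Hypothetical outputs for W_0 = 6 candidate walls.
candidate_walls = [
    (1.0, 0.0, 0.0, 2.0),
    (-1.0, 0.0, 0.0, 3.0),
    (0.0, 1.0, 0.0, 2.5),
    (0.0, -1.0, 0.0, 1.5),
    (0.0, 0.0, 1.0, 1.2),
    (0.0, 0.0, -1.0, 1.3),
]
presence = [0.97, 0.91, 0.88, 0.12, 0.95, 0.08]

walls = select_walls(candidate_walls, presence)
print(f"{len(walls)} of {len(candidate_walls)} candidate walls kept")
```

The number of walls in the reconstructed room is then simply the number of candidates that survive the threshold, up to the maximum $W_0$.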