
Predicting High-precision Depth on Low-Precision Devices Using 2D Hilbert Curves

Mykhailo Uss, Ruslan Yermolenko, Oleksii Shashko, Olena Kolodiazhna, Ivan Safonov, Volodymyr Savin, Yoonjae Yeo, Seowon Ji, Jaeyun Jeong

TL;DR

This work tackles the limitation of high dynamic range depth prediction on devices with low-precision arithmetic by encoding depth as two components on a 2D Hilbert curve. A full-precision network is trained to predict these Hilbert components, and a lightweight LUT-based post-processing step on CPU reconstructs high-precision depth from low-bit predictions, effectively increasing depth bit-width by up to $\log_2 L$ bits. Experiments on stereo depth (DispNet and DPT variants) show that eight-bit quantized models with Hilbert-component prediction can match or exceed the quality of higher-precision baselines while delivering on-device speedups and energy savings, accompanied by substantial quantization-error reductions (up to $4.6\times$ on DSP). The approach is hardware-friendly, relies on standard quantization methods (PTQ/QAT), and generalizes to monocular depth, depth completion, and related dense-prediction tasks, offering a practical path to HDR depth on resource-limited devices.

Abstract

Dense depth prediction deep neural networks (DNN) have achieved impressive results for both monocular and binocular data, but still they are limited by high computational complexity, restricting their use on low-end devices. For better on-device efficiency and hardware utilization, weights and activations of the DNN should be converted to low-bit precision. However, this precision is not sufficient to represent high dynamic range depth. In this paper, we aim to overcome this limitation and restore high-precision depth from low-bit precision predictions. To achieve this, we propose to represent high dynamic range depth as two low dynamic range components of a Hilbert curve, and to train the full-precision DNN to directly predict the latter. For on-device deployment, we use standard quantization methods and add a post-processing step that reconstructs depth from the Hilbert curve components predicted in low-bit precision. Extensive experiments demonstrate that our method increases the bit precision of predicted depth by up to three bits with little computational overhead. We also observed a positive side effect of quantization error reduction by up to 4.6 times. Our method enables effective and accurate depth prediction with DNN weights and activations quantized to eight-bit precision.
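The encode/decode idea described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`hilbert_nodes`, `encode_depth`, `decode_depth`) are hypothetical, depth is assumed to be normalized to $[0, 1]$, the curve order is fixed to 2 for the demo, and the decoder uses a brute-force nearest-segment projection in place of the LUT-based post-processing mentioned in the TL;DR.

```python
import numpy as np

def hilbert_nodes(order):
    """Nodes of a 2D Hilbert curve of the given order, scaled to [0, 1]^2."""
    n = 1 << order                      # the curve visits an n x n grid of nodes
    nodes = []
    for d in range(n * n):              # standard curve-index -> (x, y) conversion
        x = y = 0
        t, s = d, 1
        while s < n:
            rx = 1 & (t // 2)
            ry = 1 & (t ^ rx)
            if ry == 0:                 # rotate/flip the current quadrant
                if rx == 1:
                    x, y = s - 1 - x, s - 1 - y
                x, y = y, x
            x += s * rx
            y += s * ry
            t //= 4
            s *= 2
        nodes.append((x, y))
    return np.asarray(nodes, dtype=np.float64) / (n - 1)

def encode_depth(depth, nodes):
    """Map normalized depth in [0, 1] to (x, y) points along the curve."""
    t = np.asarray(depth, dtype=np.float64) * (len(nodes) - 1)
    i = np.clip(np.floor(t).astype(int), 0, len(nodes) - 2)
    frac = (t - i)[:, None]
    return nodes[i] + frac * (nodes[i + 1] - nodes[i])

def decode_depth(xy, nodes):
    """Recover depth by projecting each (x, y) onto the nearest curve segment.

    Assumption: brute-force projection stands in for a LUT-based decoder.
    """
    a, b = nodes[:-1], nodes[1:]
    ab = b - a
    ap = xy[:, None, :] - a[None, :, :]
    u = np.clip((ap * ab).sum(-1) / (ab * ab).sum(-1), 0.0, 1.0)
    closest = a[None] + u[..., None] * ab[None]
    dist2 = ((xy[:, None, :] - closest) ** 2).sum(-1)
    seg = dist2.argmin(axis=1)
    rows = np.arange(len(xy))
    return (seg + u[rows, seg]) / (len(nodes) - 1)

# Demo: compare 8-bit quantization of depth itself with 8-bit quantization
# of the two Hilbert components followed by decoding.
nodes = hilbert_nodes(order=2)
curve_length = np.linalg.norm(np.diff(nodes, axis=0), axis=1).sum()

depth = np.linspace(0.0, 1.0, 1001)
levels = 255                            # 8-bit grid on each axis
xy_q = np.round(encode_depth(depth, nodes) * levels) / levels
err_hilbert = np.abs(decode_depth(xy_q, nodes) - depth).max()
err_direct = np.abs(np.round(depth * levels) / levels - depth).max()
print("curve length:", curve_length)
print("error reduction factor:", err_direct / err_hilbert)
```

In this toy setting the quantization noise is synthetic rounding applied to exact curve points; a real quantized network would also produce predictions that fall off the curve, which is why the decoder projects onto the nearest segment rather than assuming the input lies on the curve.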

Paper Structure (21 sections, 4 equations, 22 figures, 3 tables)


Figures (22)

  • Figure 1: Illustration of DispNet quantization to INT8 precision (W8A8). Running inference of the quantized model on Qualcomm Hexagon DSP results in depth precision loss and quantization artifacts (c). Our method increases depth bit-width and reduces quantization error (d).
  • Figure 2: Idea illustration. A 1D range is quantized to $N=8$ values $q_0=0, \ldots, q_{N-1}=1$, marked by white circles. The 1D range is mapped to a 2D curve shown in red. Both the $x$ and $y$ axes are also quantized into $N=8$ values, yielding 64 2D points. Among them, 36 points lie on the curve (shown in blue). Mapping the 2D curve back to the 1D range results in 36 different quantization values. The quantization error is thus effectively reduced by a factor equal to the curve length $L=35/7=5$.
  • Figure 3: Scheme of the proposed method's inference pipeline on device.
  • Figure 4: Illustration of disparity transforms: (a) disparity map; (b) mapping to 2D with second order Hilbert curve; (c, d) $x$ and $y$ components of the Hilbert curve; (e, f) coarse and fine details of disparity map. Fine details in (f) are the least significant byte of disparity (a) represented in 16-bit format. High-frequency oscillations make it appear different from the original disparity and difficult to predict by a DNN model.
  • Figure 5: Hilbert curves for orders $p=1,2,3,4$ (from left to right). Each order is formed from the previous one by replacing every node with an elementary 3-segment sequence.
  • ...and 17 more figures
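
The arithmetic in the Figure 2 caption can be written out step by step. This is a sketch that assumes the curve is made of axis-aligned segments whose corners fall on grid points; the symbols $\Delta_{\text{1D}}$ and $\Delta_{\text{curve}}$ are illustrative names for the quantization step before and after the mapping and do not come from the source.

```latex
\begin{align*}
\Delta_{\text{1D}} &= \frac{1}{N-1} = \frac{1}{7}, \\
L &= \frac{35}{7} = 5, \\
\Delta_{\text{curve}} &= \frac{\Delta_{\text{1D}}}{L} = \frac{1}{35}
  \quad\Rightarrow\quad L\,(N-1) + 1 = 36 \text{ levels}, \\
\log_2 L &= \log_2 5 \approx 2.32 \text{ bits}.
\end{align*}
```

Under these assumptions, the step size shrinks by the factor $L$, which is consistent with the TL;DR statement that the depth bit-width grows by up to $\log_2 L$ bits.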