Enhancement of 3D Camera Synthetic Training Data with Noise Models
Katarína Osvaldová, Lukáš Gajdošech, Viktor Kocur, Martin Madaras
TL;DR
This paper addresses the domain gap between synthetic and real 3D camera data by modeling two principal noise components, lateral noise in the image plane and axial noise along the depth axis, and estimating their dependence on object distance and surface angle from a custom dataset. The authors fit a quadratic noise model for each device and use these models to augment synthetic training data for a UNet-based object segmentation task. Training with slightly more noise than estimated ($M_n \approx 1.25$) yields the best generalization to real data. The approach is validated on real Armadillo scans captured at multiple distances, and the results show that both under-noising and over-noising can harm performance, which underlines the value of device-specific noise models for realistic synthetic data generation. The work provides practical noise-modeling tools and a data-sharing setup that can improve the robustness of depth-based neural networks in real-world applications, and it outlines avenues for extending the approach to further noise types and representations.
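As a rough illustration of the model-fitting step summarized above, the following sketch fits a quadratic curve to noise measurements by least squares. It is not the authors' implementation. It assumes, for simplicity, that the noise standard deviation depends only on distance (the paper also considers surface angle), that $M_n$ acts as a multiplicative factor on the predicted noise, and that the coefficient values and function names are hypothetical placeholders.

```python
import numpy as np


def fit_quadratic_noise_model(distances, sigmas):
    """Least-squares fit of sigma(z) = a*z**2 + b*z + c; returns (a, b, c)."""
    a, b, c = np.polyfit(distances, sigmas, deg=2)
    return float(a), float(b), float(c)


def predict_sigma(coeffs, distances, noise_multiplier=1.0):
    """Evaluate the fitted model, scaled by an assumed multiplier (M_n)."""
    a, b, c = coeffs
    z = np.asarray(distances, dtype=float)
    return noise_multiplier * (a * z**2 + b * z + c)


# Hypothetical "measurements": noise-free samples of a made-up quadratic,
# standing in for per-distance noise estimates from a real device.
true_coeffs = (0.002, -0.001, 0.0005)
z_samples = np.linspace(0.5, 2.0, 16)  # distances in metres (assumed unit)
sigma_samples = predict_sigma(true_coeffs, z_samples)

fitted = fit_quadratic_noise_model(z_samples, sigma_samples)
sigma_at_1m = predict_sigma(fitted, 1.0, noise_multiplier=1.25)
```

With real data, `sigma_samples` would come from repeated captures of a known target at each distance, and a separate set of coefficients would be fitted for each device.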
Abstract
The goal of this paper is to assess the impact of noise in data captured by 3D cameras by modeling the noise of the imaging process and applying it to synthetic training data. We compiled a dataset of purpose-built scenes to obtain a noise model. Specifically, we model lateral noise, which affects the position of captured points in the image plane, and axial noise, which affects their position along the axis perpendicular to the image plane. The estimated models can be used to emulate noise in synthetic training data. The benefit of adding artificial noise is evaluated in an object segmentation experiment with rendered data. We train a series of neural networks with varying levels of noise in the data and measure their ability to generalize to real data. The results show that using too little or too much noise can hurt the networks' performance, indicating that obtaining a noise model from real scanners is beneficial for synthetic data generation.
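The two noise components described above can be emulated on a rendered depth map along the following lines. This is a minimal sketch under stated assumptions, not the method used in the paper: axial noise is taken to be zero-mean Gaussian with a standard deviation that is quadratic in depth, lateral noise is approximated by jittering pixel coordinates with Gaussian offsets and resampling with nearest-neighbour lookup, zero depth marks invalid pixels, and all names and parameter values are hypothetical.

```python
import numpy as np


def add_axial_noise(depth, coeffs, noise_multiplier=1.0, rng=None):
    """Perturb depth values along the viewing axis.

    sigma(z) = a*z**2 + b*z + c is an assumed model form; pixels with
    depth <= 0 are treated as invalid and returned as 0.
    """
    rng = np.random.default_rng() if rng is None else rng
    depth = np.asarray(depth, dtype=float)
    a, b, c = coeffs
    sigma = noise_multiplier * np.clip(a * depth**2 + b * depth + c, 0.0, None)
    noisy = depth + rng.standard_normal(depth.shape) * sigma
    return np.where(depth > 0, noisy, 0.0)


def add_lateral_noise(depth, sigma_px, noise_multiplier=1.0, rng=None):
    """Jitter each pixel's source location in the image plane.

    Each output pixel reads the depth at a randomly offset neighbour
    (offsets in pixels, rounded to the nearest integer, clipped to the image).
    """
    rng = np.random.default_rng() if rng is None else rng
    depth = np.asarray(depth, dtype=float)
    h, w = depth.shape
    rows, cols = np.mgrid[0:h, 0:w]
    scale = noise_multiplier * sigma_px
    d_rows = rng.standard_normal((h, w)) * scale
    d_cols = rng.standard_normal((h, w)) * scale
    src_rows = np.clip(np.rint(rows + d_rows).astype(int), 0, h - 1)
    src_cols = np.clip(np.rint(cols + d_cols).astype(int), 0, w - 1)
    return depth[src_rows, src_cols]


# Usage on a synthetic depth ramp (values in metres, assumed unit).
rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.5, 2.0, 64), (48, 1))
noisy = add_lateral_noise(clean, sigma_px=1.0, noise_multiplier=1.25, rng=rng)
noisy = add_axial_noise(noisy, (0.002, -0.001, 0.0005), noise_multiplier=1.25, rng=rng)
```

Setting `noise_multiplier` below or above 1 corresponds to training with less or more noise than the fitted model predicts, which is the quantity varied across the networks in the experiment.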
