ROSA: Reconstructing Object Shape and Appearance Textures by Adaptive Detail Transfer
Julian Kaltheuner, Patrick Stotko, Reinhard Klein
TL;DR
ROSA addresses the ill-posed problem of reconstructing an object's shape and SVBRDF from limited collocated-light images by combining adaptive mesh refinement guided by texture normals and curvature with a tile-based, decoder-generated high-resolution appearance texture. A novel normal loss transfers fine geometric details from the texture normals to the mesh, while curvature-driven controls govern local refinement and smoothing to keep the representation compact. Appearance textures are generated via a tile-based approach using a pre-trained autoencoder, enabling arbitrary-resolution textures without enlarging the base mesh. Quantitative and qualitative evaluations on synthetic and real data show ROSA delivers high-fidelity reconstructions with more compact meshes and realistic highlights compared to prior methods, facilitating accurate relighting and rendering.
Abstract
Reconstructing an object's shape and appearance in terms of a mesh textured by a spatially-varying bidirectional reflectance distribution function (SVBRDF) from a limited set of images captured under collocated light is an ill-posed problem. Previous state-of-the-art approaches either aim to reconstruct the appearance directly on the geometry or additionally use texture normals as part of the appearance features. However, the former requires detailed but inefficiently large meshes that would have to be simplified in a post-processing step, while the latter suffers from well-known limitations of normal maps such as missing shadows or incorrect silhouettes. Another limiting factor is the fixed and typically low resolution of the texture estimation, which results in a loss of important surface details. To overcome these problems, we present ROSA, an inverse rendering method that directly optimizes mesh geometry with spatially adaptive mesh resolution based solely on the image data. In particular, we refine the mesh and locally condition the surface smoothness based on the estimated normal texture and mesh curvature. In addition, we enable the reconstruction of fine appearance details in high-resolution textures through a pioneering tile-based method that operates on a single pre-trained decoder network but is not limited by the network output resolution.
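The tile-based idea can be illustrated with a minimal sketch: a decoder with a fixed output size is applied once per tile, and the decoded tiles are written into a larger texture whose resolution is set by the size of the latent grid rather than by the decoder. Everything in the sketch is an assumption made for illustration, not ROSA's implementation: `toy_decoder` is a hypothetical stand-in for the pre-trained decoder, `assemble_texture`, `TILE`, and `LATENT` are made-up names and sizes, and the texture has a single channel instead of full SVBRDF parameters.

```python
import math
import random

TILE = 4    # hypothetical decoder output resolution (texels per tile side)
LATENT = 3  # hypothetical size of the latent code stored per tile


def toy_decoder(z, weights):
    """Stand-in for a frozen decoder: LATENT floats -> TILE x TILE tile in (0, 1)."""
    flat = [
        1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, z))))
        for row in weights
    ]
    return [flat[i * TILE:(i + 1) * TILE] for i in range(TILE)]


def assemble_texture(latents, weights):
    """Decode a rows x cols grid of latent codes into one large texture."""
    rows, cols = len(latents), len(latents[0])
    texture = [[0.0] * (cols * TILE) for _ in range(rows * TILE)]
    for r in range(rows):
        for c in range(cols):
            tile = toy_decoder(latents[r][c], weights)
            # Copy the decoded tile into its slot of the full texture.
            for y in range(TILE):
                texture[r * TILE + y][c * TILE:(c + 1) * TILE] = tile[y]
    return texture


rng = random.Random(0)
# Fixed decoder weights, shared by all tiles (one row per output texel).
weights = [[rng.gauss(0, 1) for _ in range(LATENT)] for _ in range(TILE * TILE)]
# One latent code per tile; a larger grid yields a larger texture.
latents = [[[rng.gauss(0, 1) for _ in range(LATENT)] for _ in range(3)] for _ in range(2)]
texture = assemble_texture(latents, weights)
print(len(texture), len(texture[0]))  # 8 12
```

In this sketch the decoder weights stay fixed and only the per-tile latent codes would be optimized, so increasing the texture resolution means adding tiles rather than changing the network. How ROSA treats tile borders or conditions neighboring tiles is not covered here.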
