PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

Sinisa Stekovic, Arslan Artykov, Stefan Ainetter, Mattia D'Urso, Friedrich Fraundorfer

TL;DR

PyTorchGeoNodes presents a differentiable pipeline that converts Blender shape programs into gradient-friendly PyTorch graphs, enabling end-to-end optimization of both continuous and discrete shape parameters for 3D reconstruction from RGB-D data. A genetic algorithm drives discrete parameter estimation while gradient-based refinement optimizes continuous parameters, and the framework is extended with Gaussian splatting to capture fine details. The method recovers object parameters accurately on real ScanNet scenes, performs competitively against baselines, and integrates procedural shapes with Gaussians. This work advances interpretable, compact, and editable 3D reconstruction by uniting procedural modeling with differentiable optimization and a Blender-to-PyTorch compiler.

Abstract

We propose PyTorchGeoNodes, a differentiable module for reconstructing 3D objects and their parameters from images using interpretable shape programs. Unlike traditional CAD model retrieval, shape programs allow reasoning about semantic parameters, support editing, and have a low memory footprint. Despite their potential, shape programs for 3D scene understanding have been largely overlooked. Our key contribution is enabling gradient-based optimization by parsing shape programs, or more precisely procedural models designed in Blender, into efficient PyTorch code. While there are many possible applications of PyTorchGeoNodes, we show that combining PyTorchGeoNodes with a genetic algorithm is a method of choice for optimizing both discrete and continuous shape program parameters for 3D reconstruction and understanding of 3D object parameters. Our modular framework can be further integrated with other reconstruction algorithms, and we demonstrate one such integration to enable procedural Gaussian splatting. Our experiments on the ScanNet dataset show that our method achieves accurate reconstructions while enabling a previously unseen level of 3D scene understanding.
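
The combination of a genetic algorithm for discrete parameters with gradient-based refinement of continuous ones can be illustrated with a minimal sketch. This is not the authors' implementation: the one-dimensional "shape program" `boards`, the chamfer-style loss, the mutation scheme, and all names below are assumptions made for illustration, and the gradient is derived by hand in plain Python so the example runs without dependencies, whereas PyTorchGeoNodes would obtain it through PyTorch autograd.

```python
import random

# Toy "observed" board positions (assumption: generated with n=3, width=0.9).
TARGET = [0.225, 0.45, 0.675]

def boards(n, w):
    """Hypothetical shape program: x-positions of n evenly spaced boards in width w."""
    return [w * (i + 1) / (n + 1) for i in range(n)]

def loss_and_grad(n, w):
    """Symmetric 1D chamfer loss and its derivative with respect to w."""
    xs = boards(n, w)
    loss, grad = 0.0, 0.0
    for i, x in enumerate(xs):                  # generated point -> nearest target
        t = min(TARGET, key=lambda p: (p - x) ** 2)
        d = x - t
        loss += d * d
        grad += 2 * d * (i + 1) / (n + 1)       # chain rule: dx_i/dw = (i+1)/(n+1)
    for t in TARGET:                            # target -> nearest generated point
        i = min(range(n), key=lambda k: (xs[k] - t) ** 2)
        d = xs[i] - t
        loss += d * d
        grad += 2 * d * (i + 1) / (n + 1)
    return loss, grad

def refine(n, w, steps=20, lr=0.1):
    """Gradient descent on the continuous parameter w with the discrete n fixed."""
    for _ in range(steps):
        _, g = loss_and_grad(n, w)
        w -= lr * g
    return w

def mutate(n, w, rng):
    """Assumed mutation: nudge n by at most one (clamped to 1..6) and jitter w."""
    return min(6, max(1, n + rng.choice((-1, 0, 1)))), w + rng.gauss(0, 0.05)

def fit(generations=5, seed=0):
    rng = random.Random(seed)
    # Initial pool of individuals: every discrete choice with a few widths.
    pop = [(n, w) for n in range(1, 7) for w in (0.5, 1.0, 1.5)]
    best = None
    for _ in range(generations):
        pop = [(n, refine(n, w)) for n, w in pop]           # continuous refinement
        pop.sort(key=lambda ind: loss_and_grad(*ind)[0])    # rank by fitness
        if best is None or loss_and_grad(*pop[0])[0] < loss_and_grad(*best)[0]:
            best = pop[0]
        survivors = pop[: len(pop) // 2]                    # keep the fittest half
        pop = survivors + [mutate(n, w, rng) for n, w in survivors]
    return best
```

In this toy setup, `fit()` returns the individual with three boards and a width close to 0.9. Gradients alone cannot change `n`, which is why the discrete choice is left to the population while each individual's width is refined by gradient descent.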

Paper Structure (13 sections, 7 equations, 6 figures, 1 table)

This paper contains 13 sections, 7 equations, 6 figures, 1 table.

Figures (6)

  • Figure 1: Our PyTorchGeoNodes is a framework that enables differentiable shape programs for 3D reconstruction. Our semantically rich reconstructions of 3D objects from RGB-D scans enable understanding of important metric parameters of individual parts for different objects. These parameters are physically bound to the 3D reconstruction, as PyTorchGeoNodes enables the flow of gradients from the generated shape to the parameters of the object. Our PyTorchGeoNodes is modular, e.g. we can use it to enable procedural modeling for Gaussian splatting. In the example, our method recovers Gaussians for the target sofa and 'clones' all details from the visible armrest to the occluded armrest in the scene. Our integration enables procedural editing of Gaussians, and we can alter shape parameters to edit the reconstruction: e.g. by extending or shrinking the width of the seat, removing the backrest or the left armrest, or changing the height of the backrest and the depth of the seat.
  • Figure 2: A procedural model in Blender generating the dividing boards of a cabinet. The computational graph is designed and visualized in Blender using the Geometry Nodes feature. Underneath, we show how we abstract the different nodes using PyTorch tensors and PyTorch3D meshes. The input node takes input parameters, here {Width: 0.5, Dividing Board Thickness: 0.04, Height: 0.6, Number of Dividing Boards: 5, Board Thickness: 0.04}, and feeds them to a series of operations. The blue nodes are arithmetic and concatenation nodes, which transform input parameters and feed the results to geometry nodes, in green. In this example, we generate a cuboid mesh and instantiate a line of points, which together generate the final geometry for the dividing boards. In practice, this shape program is part of a larger shape program for modeling cabinets.
  • Figure 3: Fitting shape parameters with PyTorchGeoNodes. At every iteration, our genetic algorithm generates individuals consisting of shape parameters and object poses. We parse 3D shapes from the individuals' parameters using PyTorchGeoNodes and perform shape parameter optimization based on a loss that evaluates how well the shape fits the input scene. In the next iteration, the optimized set of individuals is used to update the pool of individuals following the survival-of-the-fittest principle.
  • Figure 4: We integrate Gaussians directly into shape programs and PyTorchGeoNodes, to have the advantages of both procedural models and Gaussian splatting.
  • Figure 5: The shape generated from recovered object parameters aligns very well with the target object, also when projected into sampled images of the scene. In the first row, we reconstruct a variety of chairs along with many detailed parameters, including seat thickness, thickness and offsets of side and front/back support legs, and the existence of different semantic elements like 'star-shaped' legs and their exact rotation. In the second row, we correctly estimate the shape of the table top and its measurements, including width, depth and thickness, the parameters of the table legs, and the existence of a middle shelf including its offset from the bottom. In the third row, we reconstruct parameters of the 'L-element' of a sofa including its depth and width, the existence and size of legs, and the existence of individual armrests and the backrest, with the corresponding measurements. We show more results in the supplementary material.
  • ...and 1 more figure
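
The node graph described in the Figure 2 caption (arithmetic nodes feeding a cuboid node and a line-of-points node, followed by instancing) can be sketched as a chain of small functions. This is only an illustration of the data flow, not the PyTorchGeoNodes API: the function names, the `depth` parameter, the stacking of boards along the height axis, and the exact arithmetic are all assumptions, since the caption does not specify them. The real framework represents these values as PyTorch tensors and PyTorch3D meshes so that gradients can flow back to the inputs; plain Python tuples are used here to keep the sketch dependency-free.

```python
from itertools import product

def cuboid(size):
    """Geometry node: 8 corner vertices of an axis-aligned box centred at the origin."""
    sx, sy, sz = size
    return [(dx * sx / 2, dy * sy / 2, dz * sz / 2)
            for dx, dy, dz in product((-1, 1), repeat=3)]

def line_of_points(count, start, end):
    """Geometry node: `count` points evenly spaced strictly between start and end."""
    return [tuple(s + (e - s) * (i + 1) / (count + 1) for s, e in zip(start, end))
            for i in range(count)]

def instance_on_points(mesh, points):
    """Geometry node: one translated copy of the mesh vertices per point."""
    return [[(vx + px, vy + py, vz + pz) for vx, vy, vz in mesh]
            for px, py, pz in points]

def dividing_boards(width, height, board_thickness, divider_thickness, count,
                    depth=0.3):  # depth is a hypothetical extra input
    # Arithmetic nodes (assumed): space left inside the outer cabinet boards.
    inner_width = width - 2 * board_thickness
    inner_height = height - 2 * board_thickness
    # Geometry nodes: one board mesh, instanced along a vertical line of points.
    board = cuboid((inner_width, depth, divider_thickness))
    z0 = board_thickness
    centres = line_of_points(count, (0.0, 0.0, z0), (0.0, 0.0, z0 + inner_height))
    return instance_on_points(board, centres)

# Input parameters taken from the Figure 2 caption.
result = dividing_boards(width=0.5, height=0.6, board_thickness=0.04,
                         divider_thickness=0.04, count=5)
```

Because every step is a simple arithmetic or geometric transformation of the inputs, replacing the tuples with tensors would make the vertex positions differentiable with respect to continuous parameters such as width and height, while the board count remains a discrete choice.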