SuperDec: 3D Scene Decomposition with Superquadric Primitives

Elisabetta Fedele, Boyang Sun, Leonidas Guibas, Marc Pollefeys, Francis Engelmann

TL;DR

SuperDec introduces a locality-driven 3D scene decomposition that represents arbitrary scenes as a compact set of superquadrics. It uses a two-stage pipeline: a Transformer-based feed-forward network predicts per-object superquadrics, and a Levenberg–Marquardt refinement then improves the fit. The result is an accurate, parsimonious decomposition that scales to full scenes via Mask3D. The approach achieves state-of-the-art object-level decomposition on ShapeNet and generalizes to real datasets (ScanNet++ and Replica) without fine-tuning. The compact, interpretable representation supports robotics tasks such as planning and grasping, as well as controllable image generation and geometry-guided editing.

Abstract

We present SuperDec, an approach for creating compact 3D scene representations via decomposition into superquadric primitives. While most recent works leverage geometric primitives to obtain photorealistic 3D scene representations, we propose to leverage them to obtain a compact yet expressive representation. We propose to solve the problem locally on individual objects and leverage the capabilities of instance segmentation methods to scale our solution to full 3D scenes. To this end, we design a new architecture that efficiently decomposes point clouds of arbitrary objects into a compact set of superquadrics. We train our architecture on ShapeNet and demonstrate its generalization capabilities on object instances extracted from the ScanNet++ dataset as well as on full Replica scenes. Finally, we show how a compact representation based on superquadrics can be useful for a diverse range of downstream applications, including robotic tasks and controllable visual content generation and editing.
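
To make the notion of a superquadric primitive concrete, the following is a minimal sketch of the standard superquadric inside-outside function in the primitive's canonical (axis-aligned, origin-centered) frame. It is the textbook formulation, not code from the paper; the function name, argument names, and the example values are assumptions chosen for illustration.

```python
def superquadric_inside_outside(point, scale, exponents):
    """Standard superquadric inside-outside function in the canonical frame.

    Returns F < 1 for points inside the primitive, F == 1 on its surface,
    and F > 1 outside. All names here are illustrative assumptions.
    """
    x, y, z = point
    a1, a2, a3 = scale        # semi-axis lengths along x, y, z (all > 0)
    e1, e2 = exponents        # shape exponents (both > 0)
    # Absolute values keep the fractional powers well defined for negative coordinates.
    xy_term = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    z_term = abs(z / a3) ** (2.0 / e1)
    return xy_term + z_term


# With e1 = e2 = 1 and unit scales the primitive is the unit sphere.
sphere = ((1.0, 1.0, 1.0), (1.0, 1.0))
# Small exponents push the shape toward a box with the same semi-axes.
boxy = ((1.0, 1.0, 1.0), (0.1, 0.1))

corner = (0.9, 0.9, 0.9)
print(superquadric_inside_outside(corner, *sphere) > 1.0)  # outside the sphere
print(superquadric_inside_outside(corner, *boxy) < 1.0)    # inside the box-like shape
```

The same point near a corner lies outside the sphere but inside the box-like shape, which illustrates how two exponents let a single primitive family cover rounded and nearly cuboid parts.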

Paper Structure

This paper contains 40 sections, 14 equations, 11 figures, 4 tables.

Figures (11)

  • Figure 1: 3D Scene Decomposition with Superquadrics. Given a 3D point cloud of an arbitrary scene, SuperDec decomposes all scene objects into a compact set of superquadric primitives.
  • Figure 2: Illustration of the SuperDec Model. Given a point cloud of an object with $N$ points, a Transformer-based neural network predicts parameters for $P$ superquadrics, as well as a soft segmentation matrix that assigns points to superquadrics. The predicted parameters include the 11 superquadric parameters and an objectness score. These predictions provide an effective initialization for the subsequent Levenberg–Marquardt (LM) optimization, which refines the superquadrics.
  • Figure 3: Qualitative Results on ShapeNet. We show results on test samples for in-category classes (first four columns) and out-of-category classes (last two columns). The latter were not seen during training and illustrate how well the models generalize to novel classes.
  • Figure 4: Grasping Result. Visualization of computed grasp poses for a milk bottle, some flowers, a side table, and a plant.
  • Figure 5: Real-world robot experiment. The top row shows the input scan (left) and the representation from SuperDec with the computed path and grasping pose (right). The bottom row illustrates the robot following the planned path. We denote the starting point of the path with a green sphere, and the target location with a red sphere. The target object (a milk bottle) is circled in red.
  • ...and 6 more figures
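
The Figure 2 caption describes the network outputs: 11 parameters and an objectness score for each of the $P$ superquadrics, plus a soft segmentation matrix over the $N$ points. The sketch below shows one plausible way to hold and consume those outputs. It rests on several assumptions that the caption does not confirm: the split of the 11 parameters into 3 scales, 2 shape exponents, 3 rotation values, and 3 translation values; the use of an objectness threshold of 0.5 to discard unused primitives; and every class, field, and function name.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Superquadric:
    # Assumed 3 + 2 + 3 + 3 = 11 parameter layout (not confirmed by the caption).
    scale: Tuple[float, float, float]
    exponents: Tuple[float, float]
    rotation: Tuple[float, float, float]     # assumed to be Euler angles
    translation: Tuple[float, float, float]
    objectness: float                        # separate score, not one of the 11

    def to_vector(self) -> List[float]:
        """Flatten the 11 geometric parameters (objectness excluded)."""
        return [*self.scale, *self.exponents, *self.rotation, *self.translation]


def assign_points(
    soft_seg: Sequence[Sequence[float]],
    primitives: Sequence[Superquadric],
    threshold: float = 0.5,                  # hypothetical cutoff
) -> List[Optional[int]]:
    """Turn an N x P soft segmentation matrix into one primitive index per point.

    Only primitives whose objectness exceeds the threshold are candidates.
    Returns None for every point if no primitive is active.
    """
    active = [j for j, sq in enumerate(primitives) if sq.objectness > threshold]
    if not active:
        return [None] * len(soft_seg)
    return [max(active, key=lambda j: row[j]) for row in soft_seg]


def make(objectness: float) -> Superquadric:
    """Build a placeholder unit-sphere primitive with the given objectness."""
    return Superquadric((1.0, 1.0, 1.0), (1.0, 1.0), (0.0, 0.0, 0.0),
                        (0.0, 0.0, 0.0), objectness)


primitives = [make(0.9), make(0.1), make(0.8)]   # the second primitive is inactive
soft_seg = [[0.2, 0.7, 0.1],                     # N = 2 points, P = 3 primitives
            [0.1, 0.2, 0.7]]
print(assign_points(soft_seg, primitives))
```

In this toy input the first point has its largest weight on the second primitive, but that primitive falls below the assumed objectness threshold, so the point is assigned to the best remaining active primitive. A hard assignment of this kind is one way the per-primitive point sets for a subsequent Levenberg–Marquardt refinement could be formed.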