Deep Sketch-Based 3D Modeling: A Survey

Alberto Tono; Jiajun Wu; Gordon Wetzstein; Iro Armeni; Hariharan Subramonyam; James Landay; Martin Fischer

TL;DR

A comprehensive survey of the latest Deep Sketch-Based 3D Modeling (DS-3DM) methods within a novel design space, highlighting limitations and identifying opportunities for interdisciplinary research in computer vision, computer graphics, and human-computer interaction, and revealing a need for controllability and information-rich outputs.

Abstract

In the past decade, advances in artificial intelligence have revolutionized sketch-based 3D modeling, leading to a new paradigm known as Deep Sketch-Based 3D Modeling (DS-3DM). DS-3DM offers data-driven methods that address the long-standing challenges of sketch abstraction and ambiguity. DS-3DM keeps humans at the center of the creative process by enhancing the flexibility, usability, faithfulness, and adaptability of sketch-based 3D modeling interfaces. This paper contributes a comprehensive survey of the latest DS-3DM within a novel design space: MORPHEUS. Built upon the Input-Model-Output (IMO) framework, MORPHEUS categorizes Models outputting Options of 3D Representations and Parts, derived from Human inputs (varying in quantity and modality), and Evaluated across diverse User-views and Styles. Throughout MORPHEUS we highlight limitations and identify opportunities for interdisciplinary research in Computer Vision, Computer Graphics, and Human-Computer Interaction, revealing a need for controllability and information-rich outputs. These opportunities align design processes more closely with users' intent, responding to the growing importance of user-centered approaches.

Paper Structure (28 sections, 12 figures, 8 tables)

Introduction
Scope
A Design Space for Deep Sketch-Based 3D Modeling
Input: $\boldsymbol{I}_{sketch}$
Amount
View
Style
Input Summary
Generative AI Model
Neural Models:
Deep Generative Models and/or Implicit Rep.:
Diffusion Models:
Transformer-Based Models:
Differentiable Rendering Models:
Pre-Trained Optimization-Based Models (Found.):
...and 13 more sections

Figures (12)

Figure 1: The initial sketch in the architectural design process contains partial massing, volumetric, and geometric information [TONOVitruvio22]. This early representation is incomplete: it shows the building only from its front view and therefore conveys partial information without a comprehensive 3D understanding. This single perspective leaves out other details of the building, particularly the rear [Front2Back2020]. Moreover, the missing information is influenced by extrinsic and intrinsic factors. Extrinsic factors depend on contextual information such as location, surrounding neighborhood, and local climate. For example, the building will take a different shape depending on whether it is designed in New York, Padua, or Palo Alto. Intrinsic factors, such as building typology (office, school, house), design constraints, materials, textures, and appearance, reside in the architect's vision of how to translate client requirements into physical spaces.
Figure 2: Overview of MORPHEUS and its overall structure. The input is divided into three main aspects: (1) the quantity of sketches, including multiple sketches, single sketches, and single or multiple sketches with additional information (e.g., text); (2) the viewpoint, specifying whether a fixed viewpoint, learned camera parameters, or a view-independent approach is used; and (3) the style, categorized as fixed sketching styles, style adapters, or flexible styles. The model section highlights six key techniques (neural models, implicit representations, diffusion models, differentiable renderers, transformers, and foundation models), with methods often combining multiple techniques. The output is divided into: (1) parts and semantics, encompassing individual element divisions, part-based segmentation, and parts with related semantics (e.g., material properties); (2) geometric genus, which ranges from fixed representations to flexible genus and enriched geometry; and (3) options, indicating whether the method produces a single output, multiple outputs, or multiple outputs with additional information.
Figure 3: DS-3DM methods are designed to accommodate a diverse array of sketching styles [Xiao2022DifferSketching] and viewpoints [sketchstyle], including bird's-eye, street-level, front, top, side, axonometric, and perspective views [WherepeopledrawlinesCole2008]. These methods are robust to variations in sketch characteristics, whether the lines are wavy or straight, single or multiple, and whether the sketches are shaded or unshaded.
Figure 4: BuildingGAN [buildingan] shown in different sketching styles: CLIPASSO [vinker2022clipasso], Apparent Ridges [Apparentridges07] (AE), Canny Edge [canny1996edgedetection] (CE), Holistically-Nested Edge Detection [xie2015holistically] (HED), and Sobel Edge [sobel_operator] (SE). These have been generated using OpenCV, with filters applied to the initial render.
Figure 5: Sketch-Modeling Deep Learning Based Methods. This timeline illustrates the evolution of DS-3DM methods, highlighting key innovations and breakthroughs in the field. Methods are organized chronologically and color-coded by year to facilitate cross-referencing with the model table, enabling identification of specific research patterns and emerging directions.
...and 7 more figures
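
The Figure 4 caption mentions sketch styles obtained by applying edge filters such as Sobel to a render. As a rough illustration of that idea only, the following is a minimal sketch of a Sobel edge filter followed by thresholding. It is written in dependency-free Python rather than OpenCV (which the caption says was used), and the function names, the toy image, and the threshold value are all assumptions made for this example, not details from the paper.

```python
import math

# Assumption: a pure-Python Sobel filter standing in for the OpenCV filters
# named in the Figure 4 caption; all names and values here are illustrative.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(image):
    """Return the gradient magnitude of a 2D grayscale image (list of rows).

    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for ky in range(3):
                for kx in range(3):
                    pixel = image[y + ky - 1][x + kx - 1]
                    gx += SOBEL_X[ky][kx] * pixel
                    gy += SOBEL_Y[ky][kx] * pixel
            out[y][x] = math.hypot(gx, gy)
    return out


def to_sketch(image, threshold):
    """Binarize the edge map: 1 marks a stroke pixel, 0 marks background."""
    magnitude = sobel_magnitude(image)
    return [[1 if m >= threshold else 0 for m in row] for row in magnitude]


# Toy "render": a dark left half and a bright right half (one vertical edge).
render = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
sketch = to_sketch(render, threshold=100)
```

In this toy case the stroke pixels appear only in the two columns that straddle the dark-to-bright boundary. A real pipeline would typically run the equivalent library filter on a full-resolution render and may invert or thin the result to resemble pen strokes.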
