Tree-D Fusion: Simulation-Ready Tree Dataset from Single Images with Diffusion Priors

Jae Joong Lee, Bosheng Li, Sara Beery, Jonathan Huang, Songlin Fei, Raymond A. Yeh, Bedrich Benes

TL;DR

Tree-D Fusion tackles the scarcity of large-scale, realistic 3D tree data by reconstructing simulation-ready trees from single images using genus-conditioned diffusion priors and a developmental space-colonization model. It trains 2D priors on Auto Arborist images and a 3D prior on synthetic trees to produce a detailed 3D envelope, which is expanded into a full branching structure. The approach achieves state-of-the-art realism and geometric fidelity across 600k models, enabling scalable forestry analysis, urban planning, and AR visualization, while maintaining the ability to simulate growth and environmental interactions. Limitations include sensitivity to asymmetric shapes and leaf occlusion, with future work targeting broader genus coverage and improved occlusion handling.

Abstract

We introduce Tree-D Fusion, featuring the first collection of 600,000 environmentally aware, 3D simulation-ready tree models generated through diffusion priors. Each reconstructed 3D tree model corresponds to an image from Google's Auto Arborist Dataset, which comprises street view images and associated genus labels of trees across North America. Our method distills the scores of two tree-adapted diffusion models, using text prompts to specify a tree genus and thereby facilitate shape reconstruction. This process reconstructs a 3D tree envelope filled with point markers, which are then used to estimate the tree's branching structure with a space colonization algorithm conditioned on the specified genus.
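The last step of the abstract, growing a branching structure toward point markers inside an envelope, can be illustrated with a minimal sketch of the generic space colonization procedure. This is not the authors' implementation: the ellipsoid envelope, the function names, the trunk fallback, and the parameters `step`, `influence`, and `kill` are all assumptions chosen for illustration. In the paper the envelope comes from the optimized NeRF, and growth settings such as these would presumably depend on the genus.

```python
import math
import random


def sample_envelope_markers(n, center, radii, seed=0):
    """Rejection-sample n markers inside an ellipsoid (a stand-in for the envelope)."""
    rng = random.Random(seed)
    markers = []
    while len(markers) < n:
        u = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        if sum(c * c for c in u) <= 1.0:  # keep only points inside the unit ball
            markers.append(tuple(center[k] + radii[k] * u[k] for k in range(3)))
    return markers


def _step_from(node, direction, step):
    """Point `step` away from `node` along `direction`; None if the direction is zero."""
    norm = math.sqrt(sum(c * c for c in direction))
    if norm < 1e-12:
        return None
    return tuple(node[k] + step * direction[k] / norm for k in range(3))


def space_colonization(markers, root=(0.0, 0.0, 0.0), step=0.25,
                       influence=1.0, kill=0.35, max_iters=100):
    """Grow a skeleton toward markers; returns (nodes, parents), parents[0] == -1.

    step, influence, and kill are hypothetical stand-ins for genus-dependent settings.
    """
    nodes, parents = [root], [-1]
    markers = list(markers)
    seen = {tuple(round(c, 6) for c in root)}
    for _ in range(max_iters):
        if not markers:
            break
        # Each marker pulls on its nearest node if that node is within `influence`.
        pull = {}
        for m in markers:
            dists = [math.dist(m, p) for p in nodes]
            i = min(range(len(nodes)), key=dists.__getitem__)
            if 0.0 < dists[i] <= influence:
                acc = pull.setdefault(i, [0.0, 0.0, 0.0])
                for k in range(3):
                    acc[k] += (m[k] - nodes[i][k]) / dists[i]
        if not pull:
            # Assumed trunk fallback: no marker is in range yet, so extend the
            # newest node toward the centroid of the remaining markers.
            tip = len(nodes) - 1
            centroid = [sum(m[k] for m in markers) / len(markers) for k in range(3)]
            pull[tip] = [centroid[k] - nodes[tip][k] for k in range(3)]
        new_nodes = []
        for i, direction in pull.items():
            child = _step_from(nodes[i], direction, step)
            if child is None:
                continue
            key = tuple(round(c, 6) for c in child)
            if key in seen:  # avoid regrowing an identical node
                continue
            seen.add(key)
            nodes.append(child)
            parents.append(i)
            new_nodes.append(child)
        if not new_nodes:
            break
        # Markers reached by a new node are consumed.
        markers = [m for m in markers
                   if all(math.dist(m, c) > kill for c in new_nodes)]
    return nodes, parents


markers = sample_envelope_markers(150, center=(0.0, 0.0, 3.0), radii=(1.2, 1.2, 1.5))
nodes, parents = space_colonization(markers)
print(len(nodes), "skeleton nodes")
```

The returned `parents` list encodes the tree topology, so each branch segment is the pair `(nodes[parents[i]], nodes[i])`. A denser marker set or a smaller `kill` distance generally yields finer branching, at the cost of more iterations.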

Paper Structure

This paper contains 13 sections, 7 equations, 13 figures, 5 tables.

Figures (13)

  • Figure 1: Tree-D Fusion takes a single-view image (left) and reconstructs a 3D simulation-ready tree model. The tree model, which includes a detailed branching structure and leaves, can be used to simulate growth over time. We provide a dataset of 3D tree models reconstructed from 600,000 Google Street View images.
  • Figure 2: Magic123 (Qian et al., 2023) fails to capture trees' complex geometry.
  • Figure 3: The input to Tree-D Fusion is an RGB image of a tree and its genus. To perform shape reconstruction, we minimize the loss function w.r.t. the NeRF parameters $\theta$. The loss function is constructed from two diffusion models, Stable Diffusion with LoRA and Zero123, trained on real tree images and synthetic 3D tree models, respectively. The output is an optimized NeRF $\tau(\theta^*)$, which is a detailed 3D tree envelope. We then populate the volume of $\tau(\theta^*)$ with markers based on the envelope and reconstruct trees using a genus-conditioned space colonization algorithm.
  • Figure 4: Models of trees and the phenotypic characteristics derived from them: height and Diameter at Breast Height (DBH). The radius depicted illustrates the projected amount of shade.
  • Figure 5: The input image (a) is reconstructed into a digital twin (b-c) that responds to its environment, such as proximity to a wall (d-e). Growing three copies of the same tree shows their competition for space (f-g).
  • ...and 8 more figures
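
The loss described in the Figure 3 caption is built from two diffusion priors. As a hedged illustration only, the standard score distillation gradient from the text-to-3D literature, and one plausible way of combining two such terms, can be written as below. The weights $\lambda_{\mathrm{2D}}$ and $\lambda_{\mathrm{3D}}$ and the exact combination are assumptions, not taken from the paper, whose full objective may include further terms.

$$
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta) = \mathbb{E}_{t,\epsilon}\left[ w(t)\,\big(\hat{\epsilon}_\phi(x_t; y, t) - \epsilon\big)\,\frac{\partial x}{\partial \theta} \right],
\qquad
\theta^* = \arg\min_\theta \; \lambda_{\mathrm{2D}}\,\mathcal{L}^{\mathrm{2D}}_{\mathrm{SDS}}(\theta) + \lambda_{\mathrm{3D}}\,\mathcal{L}^{\mathrm{3D}}_{\mathrm{SDS}}(\theta)
$$

Here $x$ is an image rendered from the NeRF $\tau(\theta)$, $x_t$ is its noised version at diffusion timestep $t$, $\epsilon$ is the injected Gaussian noise, $\hat{\epsilon}_\phi$ is the noise predicted by a diffusion model with parameters $\phi$, $w(t)$ is a timestep weighting, and $y$ is the conditioning signal: a genus text prompt for the 2D prior, and the input image with a relative camera pose for a Zero123-style 3D prior.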