Object-Centric 3D Gaussian Splatting for Strawberry Plant Reconstruction and Phenotyping

Jiajia Li, Keyi Zhu, Qianwen Zhang, Dong Chen, Qi Sun, Zhaojian Li

TL;DR

This work addresses non-destructive, high-throughput strawberry phenotyping with an object-centric 3D Gaussian Splatting (3DGS) framework that uses SAM-2 foreground masks to suppress background noise during reconstruction. The pipeline combines RGBA-based loss masking, opacity-guided Gaussian culling, and background randomization. Scale is calibrated against a calibration cube of known size, and plant height and canopy width are extracted with DBSCAN clustering and PCA. Compared with NeRF-based baselines, the method achieves higher reconstruction quality (e.g., in PSNR and SSIM) while reducing training time and memory usage, and it estimates key traits with centimeter-level accuracy, which supports automated, scalable phenotyping. The approach is relevant to agricultural monitoring and breeding and could extend to other crops and to field-scale deployments through hierarchical or distributed reconstruction strategies.
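
To make the masking idea concrete, the following is a minimal NumPy sketch of one plausible way to combine an RGBA target with background randomization. It is not the authors' implementation. It assumes that the SAM-2 mask is stored as the alpha channel of each training image, that the renderer returns premultiplied color together with accumulated opacity, and that an L1 photometric term is used. The function names `composite_over` and `masked_l1_loss` are hypothetical.

```python
import numpy as np


def composite_over(premultiplied_rgb, alpha, background):
    """Place a premultiplied RGB image over a solid background color.

    premultiplied_rgb: (H, W, 3) floats, already multiplied by alpha.
    alpha:             (H, W) floats in [0, 1].
    background:        (3,) floats in [0, 1].
    """
    return premultiplied_rgb + (1.0 - alpha[..., None]) * background


def masked_l1_loss(rendered_rgb, rendered_alpha, gt_rgba, rng):
    """L1 loss between a render and an RGBA target over a random background.

    Assumption: rendered_rgb is premultiplied (accumulated) color and
    rendered_alpha is the accumulated opacity of the render.
    gt_rgba holds straight (non-premultiplied) RGB plus the mask as alpha.
    """
    background = rng.random(3)  # new random color on every call
    gt_alpha = gt_rgba[..., 3]
    gt_premultiplied = gt_rgba[..., :3] * gt_alpha[..., None]
    target = composite_over(gt_premultiplied, gt_alpha, background)
    prediction = composite_over(rendered_rgb, rendered_alpha, background)
    return float(np.abs(prediction - target).mean())
```

Because the background color changes on every call, a Gaussian outside the mask cannot match the target by settling on one fixed color. Under these assumptions, the only consistent way to lower the loss in masked-out regions is for the render to become transparent there, which is the condition an opacity-based culling step can then exploit.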

Abstract

Strawberries are among the most economically significant fruits in the United States, generating over $2 billion in annual farm-gate sales and accounting for approximately 13% of the total fruit production value. Plant phenotyping plays a vital role in selecting superior cultivars by characterizing plant traits such as morphology, canopy structure, and growth dynamics. However, traditional plant phenotyping methods are time-consuming, labor-intensive, and often destructive. Recently, neural rendering techniques, notably Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3DGS), have emerged as powerful frameworks for high-fidelity 3D reconstruction. By capturing a sequence of multi-view images or videos around a target plant, these methods enable non-destructive reconstruction of complex plant architectures. Despite their promise, most current applications of 3DGS in agricultural domains reconstruct the entire scene, including background elements, which introduces noise, increases computational costs, and complicates downstream trait analysis. To address this limitation, we propose a novel object-centric 3D reconstruction framework incorporating a preprocessing pipeline that leverages the Segment Anything Model v2 (SAM-2) and alpha channel background masking to achieve clean strawberry plant reconstructions. This approach produces more accurate geometric representations while substantially reducing computational time. With a background-free reconstruction, our algorithm can automatically estimate important plant traits, such as plant height and canopy width, using DBSCAN clustering and Principal Component Analysis (PCA). Experimental results show that our method outperforms conventional pipelines in both accuracy and efficiency, offering a scalable and non-destructive solution for strawberry plant phenotyping.
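
The trait-extraction step can be illustrated with a small, self-contained sketch. This is an illustration under stated assumptions rather than the paper's code: it assumes the point cloud is already aligned so that z points up, takes the largest DBSCAN cluster as the plant, measures height as the z extent, and measures canopy width as the extent along the first principal axis of the horizontal projection. `dbscan` is a brute-force O(n²) NumPy version written for readability (a library implementation such as scikit-learn's `DBSCAN` would normally be used), and `estimate_traits`, `eps`, `min_samples`, and `scale` are hypothetical names.

```python
import numpy as np


def dbscan(points, eps, min_samples):
    """Brute-force DBSCAN. Returns integer labels, with -1 marking noise."""
    n = len(points)
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    # Each neighborhood includes the point itself (distance 0).
    neighbors = [np.flatnonzero(dists[i] <= eps) for i in range(n)]
    is_core = np.array([len(nb) >= min_samples for nb in neighbors])
    labels = np.full(n, -1)
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not is_core[i]:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:
            j = stack.pop()
            if not is_core[j]:
                continue  # border points join a cluster but do not expand it
            for k in neighbors[j]:
                if labels[k] == -1:
                    labels[k] = cluster
                    stack.append(k)
        cluster += 1
    return labels


def estimate_traits(points, eps, min_samples, scale=1.0):
    """Return (height, canopy_width) of the largest cluster.

    Assumptions: z is the vertical axis, and `scale` converts
    reconstruction units to physical units (for example, the physical
    edge length of a reference cube divided by its reconstructed length).
    """
    labels = dbscan(points, eps, min_samples)
    kept = labels[labels >= 0]
    if kept.size == 0:
        raise ValueError("DBSCAN found no cluster; adjust eps or min_samples")
    largest = np.bincount(kept).argmax()
    plant = points[labels == largest] * scale

    height = plant[:, 2].max() - plant[:, 2].min()

    # PCA on the horizontal (x, y) projection via the 2x2 covariance matrix.
    xy = plant[:, :2] - plant[:, :2].mean(axis=0)
    _, eigvecs = np.linalg.eigh(np.cov(xy, rowvar=False))
    major_axis = eigvecs[:, -1]  # eigh returns eigenvalues in ascending order
    projection = xy @ major_axis
    width = projection.max() - projection.min()
    return float(height), float(width)
```

Clustering first matters because a few stray Gaussians far from the plant would otherwise inflate both extents. Using the principal axis for width makes the measurement independent of how the plant happens to be rotated about the vertical axis.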

Paper Structure

This paper contains 19 sections, 16 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Framework of 3D Gaussian Splatting (3DGS) (Kerbl et al., 2023). The process consists of four main stages: (a) 3D Gaussian initialization: multi-view images are used to generate a sparse point cloud via structure-from-motion (SfM), from which 3D Gaussians are initialized with parameters $(\alpha, \bm{c}, \bm{\mu}, \bm{\Sigma})$; (b) Splatting: the Gaussians are projected onto the image plane for differentiable rendering; (c) 3D Gaussian model optimization: the parameters are iteratively optimized by minimizing the discrepancy between rendered and ground-truth images; and (d) Density control: point pruning and densification maintain efficient and accurate scene representation.
  • Figure 2: Qualitative comparison of reconstructed strawberry plants using different NeRF-based and 3DGS-based models. From left to right: Ground Truth, Nerfacto, Mip-NeRF, Instant-NGP, Splatfacto, Splatfacto with post background removal (Splatfacto-PBR), and our proposed object-centric 3DGS. The red arrows highlight blurred or missing regions in NeRF-based reconstructions, while the red boxes emphasize finer structural details, such as leaf edges, petioles, and fruit surfaces, accurately preserved by our method.
  • Figure 3: Comparison of reconstructed point clouds between the baseline Splatfacto and the proposed object-centric 3DGS method. The left two columns show results from Splatfacto, which exhibit noisy and uneven point distributions due to background interference and redundant Gaussian primitives. The right columns display our method, which generates a cleaner, denser, and geometrically consistent point cloud with a clear separation between the plant and reference cube. The improved spatial organization and reduced background artifacts demonstrate the effectiveness of the integrated foreground masking and object-centric learning strategies.
  • Figure 4: Strawberry plant height and width estimation via DBSCAN and PCA. The red solid line is the fitted line, and the black dashed line is the ideal line.
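
The splatting stage (b) in Figure 1 produces each pixel by blending depth-sorted Gaussians front to back, with pixel color $C = \sum_i c_i \alpha_i \prod_{j<i} (1 - \alpha_j)$. The sketch below shows that blending rule for a single pixel in NumPy. It is a simplified illustration, not the rasterizer used in the paper: it assumes the Gaussians are already sorted nearest first and that each `alphas[i]` is the effective opacity at this pixel (learned opacity times the projected 2D Gaussian falloff). The function name `composite_front_to_back` is hypothetical.

```python
import numpy as np


def composite_front_to_back(colors, alphas):
    """Blend depth-sorted samples for one pixel.

    colors: (N, 3) RGB per Gaussian, nearest first.
    alphas: (N,) effective opacity of each Gaussian at this pixel.
    Returns (premultiplied RGB, accumulated alpha).
    """
    colors = np.asarray(colors, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    # Transmittance reaching sample i: T_i = prod_{j<i} (1 - alpha_j)
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - alphas)[:-1]))
    weights = alphas * transmittance
    rgb = (weights[:, None] * colors).sum(axis=0)
    return rgb, float(weights.sum())


# Two half-transparent samples: red in front of green.
rgb, accumulated = composite_front_to_back(
    [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [0.5, 0.5]
)
```

In this example the front sample receives weight 0.5 and the second receives 0.5 × 0.5 = 0.25, so the accumulated alpha is 0.75 and the remaining 0.25 is what a background color would fill. Every step is differentiable with respect to color and opacity, which is what allows stage (c) to optimize the Gaussian parameters from image discrepancies.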