Measure Anything: Real-time, Multi-stage Vision-based Dimensional Measurement using Segment Anything
Yongkyu Lee, Shivam Kumar Panda, Wei Wang, Mohammad Khalid Jawed
TL;DR
This work tackles vision-based dimensional measurement by integrating the Segment Anything Model (SAM) into a multi-stage pipeline that yields diameter, length, and volume estimates for objects with circular cross-sections. By combining SAM-based segmentation (with manual or automated prompts), refined mask processing, geometry-aware skeleton construction, and 2D-3D transformation, Measure Anything enables real-time, automated measurements and supports robotic grasping applications. Key contributions include a robust, modular pipeline, validation on Canola stems collected under field conditions, and a demonstration of automated prompting via a keypoint detector that scales the method to high-throughput measurement. The approach connects segmentation, depth-aware 3D reconstruction, and actionable geometric features, offering practical impact for precision agriculture and autonomous manipulation, and it leaves handling of occlusions and non-circular cross-sections to future work.
Abstract
We present Measure Anything, a comprehensive vision-based framework for dimensional measurement of objects with circular cross-sections, leveraging the Segment Anything Model (SAM). Our approach estimates key geometric features -- including diameter, length, and volume -- for rod-like geometries with varying curvature and for general objects with constant skeleton slope. The framework integrates segmentation, mask processing, skeleton construction, and 2D-3D transformation, packaged in a user-friendly interface. We validate our framework by estimating the diameters of Canola stems -- collected from agricultural fields in North Dakota -- which are thin and non-uniform, posing challenges for existing methods. Measuring stem diameter is critical, as it is a phenotypic trait that correlates with the health and yield of Canola crops. This application also exemplifies the potential of Measure Anything, where integrating intelligent models -- such as keypoint detection -- extends its scalability to fully automate the measurement process for high-throughput applications. Furthermore, we showcase its versatility in robotic grasping, leveraging extracted geometric features to identify optimal grasp points.
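
To make the last stage of the pipeline concrete, the following is a minimal sketch of how skeleton points and pixel diameters could be lifted to metric length and volume. It is not the authors' implementation: the abstract does not give the formulas, so the sketch assumes a pinhole camera with known intrinsics (`fx`, `fy`, `cx`, `cy`), a depth value at each skeleton pixel, a single focal length `f` for scaling diameters, and a truncated-cone (frustum) approximation between neighboring skeleton points. All function names are hypothetical.

```python
import numpy as np


def backproject(u, v, z, fx, fy, cx, cy):
    """Assumed pinhole model: lift pixels (u, v) at depth z to camera-frame XYZ."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)


def pixel_to_metric_diameter(d_px, z, f):
    """Scale a diameter measured in pixels to metric units at depth z.

    Assumes the cross-section is roughly parallel to the image plane and
    that fx and fy are close enough to use a single focal length f.
    """
    return d_px * z / f


def length_and_volume(points_3d, diameters):
    """Approximate length and volume of an object with circular cross-sections.

    points_3d: (N, 3) ordered skeleton points in metric units.
    diameters: (N,) metric diameters at those points.
    Each pair of neighboring points is treated as a frustum.
    """
    seg = np.linalg.norm(np.diff(points_3d, axis=0), axis=1)  # segment lengths
    d1, d2 = diameters[:-1], diameters[1:]
    # Frustum volume with end diameters d1, d2 and height seg.
    vol = np.pi * seg / 12.0 * (d1**2 + d1 * d2 + d2**2)
    return seg.sum(), vol.sum()


# Usage: a straight, uniform rod seen fronto-parallel at 0.5 m depth.
u = np.array([320.0, 420.0, 520.0])
v = np.array([240.0, 240.0, 240.0])
z = np.array([0.5, 0.5, 0.5])
pts = backproject(u, v, z, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
diam = pixel_to_metric_diameter(np.array([10.0, 10.0, 10.0]), z, f=500.0)
length, volume = length_and_volume(pts, diam)
print(length, volume)
```

For a uniform rod the frustum sum reduces to the closed-form cylinder volume, which gives a quick sanity check. For curved or tapering stems, accuracy under these assumptions depends on how densely the skeleton is sampled and on the quality of the depth values.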
