Understanding Particles From Video: Property Estimation of Granular Materials via Visuo-Haptic Learning

Zeqing Zhang, Guangze Zheng, Xuebo Ji, Guanqi Chen, Ruixing Jia, Wentao Chen, Guanhua Chen, Liangjun Zhang, Jia Pan

TL;DR

This work estimates relative granular material (GM) properties, specifically particle size and density, from video alone, using a visuo-haptic learning framework inspired by a probe-dragging contact model. An encoder–decoder network, trained with force signals as supervision and aided by a particle-tracking preprocessing step, maps granule motion in videos to latent embeddings that implicitly encode size and density distributions, so only video is needed at inference time. The approach is validated on the GM15-VF dataset and extended to real-world beach sands and handheld video collection; ablation studies show the importance of tracking and the interpretability of the latent space. Key findings include robust force inference on unseen GMs, a 4D latent space that captures relative GM properties, and practical generalization to non-lab data, though moisture effects can limit applicability.

Abstract

Granular materials (GMs) are ubiquitous in daily life, and understanding their properties is important, especially in agriculture and industry. However, existing methods require dedicated measurement equipment and considerable human effort to handle large numbers of particles. In this paper, we introduce a method for estimating the relative values of particle size and density from video of an interaction with GMs. It is trained with a visuo-haptic learning framework inspired by a contact model, which reveals a strong correlation between GM properties and the visual-haptic data recorded while a probe is dragged through the GMs. After training, the network maps the visual modality to the haptic signal and implicitly characterizes the relative distribution of particle properties in its latent embeddings, as interpreted through that contact model. We can therefore analyze GM properties using the trained encoder alone: only visual information is needed, without extra sensory modalities or human labeling effort. The presented GM property estimator has been extensively validated via comparison and ablation experiments. Its generalization capability has also been evaluated, and a real-world application on the beach is demonstrated. Experiment videos are available at https://sites.google.com/view/gmwork/vhlearning.

Paper Structure

This paper contains 20 sections, 1 equation, 10 figures, 2 tables.

Figures (10)

  • Figure 1: Overview of this work. (a) Probe-dragging. A simplified GM-tool contact model is taken from the physics literature [albert1999slow]. (b) Visual and haptic data. The force sequence $F_d$ is measured by the F/T sensor, and the granule motion is extracted from a video clip by the proposed particle tracking algorithm. (c) Workflow of our visuo-haptic learning, where the granule properties are analyzed from the latent features after training.
  • Figure 2: Architecture of our visuo-haptic learning framework, inspired by the contact model in Eq. eq:third_law. The dataset GM15-VF provides the video $\mathbf{V}$ ($C = 3$, $D = 300$, $H = 480$, $W = 640$) of the probe-dragging and the corresponding force sequence $\mathbf{F}$ ($C = 1$, $D = 405$). After the video is processed by the proposed particle tracking algorithm, the encoder takes as input the trajectories $\mathbf{P}$ (in $x$ and $y$ coordinates) of $49\times12$ points throughout $155$ frames. The decoder then generates an inferred force sequence $\hat{\mathbf{F}}$, which is used to calculate the MSE loss against the true force sequence $\mathbf{F}$.
  • Figure 3: The dataset GM15-VF is composed of $15$ types of GMs; the IDs of unseen particles are shown on green backgrounds.
  • Figure 4: Data collection. (a) Experiment setup. (b) Visual data. The proposed particle tracking algorithm is applied to the video clip captured by the mounted camera. (c) Haptic data. The force $F_d$ exerted on the probe is measured by the mounted F/T sensor. Here we only consider the resultant force in the $x$-$y$ plane.
  • Figure 5: Force inference. (a) Force sequences predicted from input videos, including seen and unseen (green background) materials. (b) Mean and standard deviation of the force-sequence MSE for each GM.
  • ...and 5 more figures
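
The data flow described in the Figure 2 caption can be sketched in a few lines of NumPy. This is only a shape-level illustration under stated assumptions, not the authors' network: the tensor sizes (155 frames, $49\times12$ tracked points, 405 force samples) come from the caption and the 4D latent size from the TL;DR, while the single linear layers, the `tanh` nonlinearity, the random placeholder data, and all function names (`init_linear`, `encode`, `decode`, `mse_loss`) are hypothetical stand-ins.

```python
import numpy as np

# Sizes from the Figure 2 caption and the TL;DR; everything else is assumed.
N_FRAMES = 155        # frames of tracked granule motion
N_POINTS = 49 * 12    # tracked points per frame
N_COORDS = 2          # x and y coordinates
FORCE_LEN = 405       # samples in the force sequence F
LATENT_DIM = 4        # 4D latent space mentioned in the TL;DR

rng = np.random.default_rng(0)


def init_linear(n_in, n_out):
    """Hypothetical linear layer: scaled random weights and a zero bias."""
    weights = rng.normal(0.0, 1.0 / np.sqrt(n_in), size=(n_in, n_out))
    return weights, np.zeros(n_out)


def encode(trajectories, params):
    """Flatten trajectories P and map them to a latent embedding z."""
    weights, bias = params
    return np.tanh(trajectories.reshape(-1) @ weights + bias)


def decode(latent, params):
    """Map the latent embedding z to an inferred force sequence F_hat."""
    weights, bias = params
    return latent @ weights + bias


def mse_loss(f_hat, f_true):
    """Mean squared error between inferred and measured force sequences."""
    return float(np.mean((f_hat - f_true) ** 2))


enc_params = init_linear(N_FRAMES * N_POINTS * N_COORDS, LATENT_DIM)
dec_params = init_linear(LATENT_DIM, FORCE_LEN)

# Placeholder inputs standing in for tracked trajectories and measured force.
P = rng.normal(size=(N_FRAMES, N_POINTS, N_COORDS))
F = rng.normal(size=FORCE_LEN)

z = encode(P, enc_params)        # latent embedding used for property analysis
F_hat = decode(z, dec_params)    # force inferred from visual input only
loss = mse_loss(F_hat, F)        # training signal from the haptic modality
print(z.shape, F_hat.shape)
```

In this sketch, only `encode` would be needed after training to obtain the latent embedding from video-derived trajectories, which mirrors the vision-only inference described in the abstract.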