Event-based Visual Deformation Measurement
Yuliang Wu, Wei Zhai, Yuxin Cui, Tiesong Zhao, Yang Cao, Zheng-Jun Zha
TL;DR
This work tackles dense deformation measurement in challenging, dynamic scenes where traditional frame-based VDM struggles due to large inter-frame motion and prohibitive data storage. It introduces an event-frame fusion strategy and an Affine Invariant Simplicial (AIS) framework that linearizes the deformation field into low-parameter, locally affine sub-regions, combined with a neighborhood-greedy optimization to suppress long-term error accumulation. A new benchmark with temporally aligned event and frame data across 120+ sequences demonstrates robust performance, achieving a survival rate of 65.7% for large displacements and substantially lower data storage than high-speed video methods. Overall, the approach enables accurate, storage-efficient dense deformation tracking with potential impact on structural health monitoring, robotics, and biomechanics.
Abstract
Visual Deformation Measurement (VDM) aims to recover dense deformation fields by tracking surface motion from camera observations. Traditional image-based methods rely on minimal inter-frame motion to constrain the correspondence search space, which limits their applicability to highly dynamic scenes or necessitates high-speed cameras at the cost of prohibitive storage and computational overhead. We propose an event-frame fusion framework that exploits events for temporally dense motion cues and frames for spatially dense, precise estimation. Revisiting the solid elastic modeling prior, we propose an Affine Invariant Simplicial (AIS) framework. It partitions the deformation field into linearized sub-regions with a low-parametric representation, effectively mitigating motion ambiguities arising from sparse and noisy events. To speed up the parameter search and reduce error accumulation, we introduce a neighborhood-greedy optimization strategy in which well-converged sub-regions guide their poorly converged neighbors, thereby suppressing local error accumulation in long-term dense tracking. To evaluate the proposed method, we establish a benchmark dataset with temporally aligned event streams and frames, encompassing over 120 sequences spanning diverse deformation scenarios. Experimental results show that our method outperforms the state-of-the-art baseline by 1.6% in survival rate. Remarkably, it achieves this using only 18.9% of the data storage and processing resources of high-speed video methods.
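
To make the idea of linearized, low-parameter sub-regions concrete, the following is a minimal sketch of a generic piecewise-affine warp on a single 2D triangle (a 2-simplex). It is not the authors' AIS formulation, which the abstract does not specify. It only illustrates the general property that, inside one simplex, the deformation is fully determined by the displacements of its three vertices (six parameters), and that barycentric weights are preserved under an affine map. The names `barycentric` and `warp_point` are hypothetical.

```python
from typing import Tuple

Point = Tuple[float, float]
Triangle = Tuple[Point, Point, Point]


def barycentric(p: Point, tri: Triangle) -> Tuple[float, float, float]:
    """Return the barycentric weights of point p with respect to triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if abs(det) < 1e-12:
        raise ValueError("degenerate triangle: vertices are collinear")
    w1 = ((y2 - y3) * (p[0] - x3) + (x3 - x2) * (p[1] - y3)) / det
    w2 = ((y3 - y1) * (p[0] - x3) + (x1 - x3) * (p[1] - y3)) / det
    return w1, w2, 1.0 - w1 - w2


def warp_point(p: Point, tri_ref: Triangle, tri_def: Triangle) -> Point:
    """Map p from the reference triangle to the deformed triangle.

    Barycentric weights are invariant under affine maps, so reusing the
    reference weights on the deformed vertices applies the unique affine
    transform that sends tri_ref onto tri_def.
    """
    w = barycentric(p, tri_ref)
    x = sum(wi * v[0] for wi, v in zip(w, tri_def))
    y = sum(wi * v[1] for wi, v in zip(w, tri_def))
    return (x, y)


if __name__ == "__main__":
    # Assumed toy data: stretch x by 2 and translate by (1, 1).
    tri_ref = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
    tri_def = ((1.0, 1.0), (3.0, 1.0), (1.0, 2.0))
    print(warp_point((0.25, 0.5), tri_ref, tri_def))
```

In a dense tracker built along these lines, only the vertex positions of each simplex would need to be estimated, and every interior pixel would follow from them. How the paper actually parameterizes and optimizes its sub-regions may differ from this sketch.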
