Can Foundation Models Revolutionize Mobile AR Sparse Sensing?
Yiqin Zhao, Tian Guo
TL;DR
This paper tackles the energy–accuracy trade-off in mobile AR by evaluating foundation-model-driven sparse sensing on real-world data. It demonstrates that foundation-model depth estimation markedly improves cross-frame information reuse via geometry-aware image warping, yielding gains of about $25.5\%$ in RGB SSIM and $30.7\%$ in depth SSIM over LiDAR depth, and that it remains robust across larger frame gaps. For long-duration AR, foundation-model depth enables substantially better 3D reconstruction under sparse frame inputs; for example, the Hausdorff distance drops from $0.48$ to $0.25$ with Poisson + ICP (lower is better), which suggests that sparse sensing scales to longer sessions. The work also shows that information overlap evolves nonlinearly, and it advocates hybrid temporal–spatial sparse-sensing policies that adapt to user and environment context, enabling mobile AR to sense only when it matters.
Abstract
Mobile sensing systems have long faced a fundamental trade-off between sensing quality and efficiency due to constraints on computation, power, and other resources. Sparse sensing, which acquires and processes only a subset of sensor data, has been a key strategy for maintaining performance under such constraints. However, existing sparse sensing methods often suffer from reduced accuracy, as missing information across space and time introduces uncertainty into many sensing systems. In this work, we investigate whether foundation models can change the landscape of mobile sparse sensing. Using real-world mobile AR data, our evaluations demonstrate that foundation models offer significant improvements in geometry-aware image warping, a central technique for enabling accurate reuse of cross-frame information. Furthermore, our study demonstrates the scalability of foundation model-based sparse sensing and shows its leading performance in 3D scene reconstruction. Collectively, our study reveals both the promise and the open challenges of integrating foundation models into mobile sparse sensing systems.
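To make the idea of geometry-aware image warping concrete, the following is a minimal sketch of depth-based reprojection under a standard pinhole camera model. It is not the authors' implementation: the function name `warp_pixels`, the shared intrinsics matrix `K`, and the source-to-target transform `T_src_to_tgt` are assumptions introduced here for illustration. The depth map could come from LiDAR or from a depth-estimation model; the sketch assumes it is strictly positive.

```python
import numpy as np


def warp_pixels(depth, K, T_src_to_tgt):
    """Map every source pixel to target-image coordinates using per-pixel depth.

    depth: (H, W) metric depth of the source frame (assumed > 0 everywhere).
    K: (3, 3) pinhole intrinsics, assumed to be shared by both frames.
    T_src_to_tgt: (4, 4) rigid transform from source to target camera coordinates.
    Returns (uv, z): (H, W, 2) target pixel coordinates and (H, W) target depth.
    """
    h, w = depth.shape
    # u indexes columns, v indexes rows.
    u, v = np.meshgrid(np.arange(w, dtype=np.float64),
                       np.arange(h, dtype=np.float64))
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)

    # Back-project each pixel to a 3D point: X = depth * K^{-1} [u, v, 1]^T.
    rays = pixels @ np.linalg.inv(K).T
    points_src = rays * depth.reshape(-1, 1)

    # Move the points into the target camera's coordinate frame.
    R, t = T_src_to_tgt[:3, :3], T_src_to_tgt[:3, 3]
    points_tgt = points_src @ R.T + t

    # Project back onto the target image plane.
    projected = points_tgt @ K.T
    z = projected[:, 2]
    uv = projected[:, :2] / z[:, None]
    return uv.reshape(h, w, 2), z.reshape(h, w)


# Hypothetical example: flat scene at 2 m, camera frames offset by 0.1 m along x.
K = np.array([[100.0, 0.0, 32.0],
              [0.0, 100.0, 24.0],
              [0.0, 0.0, 1.0]])
depth = np.full((48, 64), 2.0)
T = np.eye(4)
T[0, 3] = 0.1
uv, z = warp_pixels(depth, K, T)
```

In this example every pixel shifts horizontally by $f_x t_x / d = 100 \times 0.1 / 2 = 5$ pixels, which illustrates why errors in the depth map translate directly into misaligned warps. A full pipeline would additionally resample the source colors at the warped coordinates and handle occlusions and invalid depth, which this sketch omits.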
