MultiSurf-GPT: Facilitating Context-Aware Reasoning with Large-Scale Language Models for Multimodal Surface Sensing
Yongquan Hu, Black Sun, Pengcheng An, Zhuying Li, Wen Hu, Aaron J. Quigley
TL;DR
This work tackles the need for unified processing of multimodal surface sensing data to enable context-aware mobile computing. It introduces MultiSurf-GPT, a GPT-4o-based framework that treats radar, microscope, and multispectral inputs within a single prompting-driven pipeline to perform both low-level recognition and high-level contextual reasoning. Through experiments on Tangible Radar, MicroCam, and SpeCam datasets, the approach demonstrates high radar task accuracy and notable improvements in image-based analyses when using one-shot prompts, while also showing enhanced context-aware interpretation compared to baseline GPT-4o. The study highlights the potential of multimodal LLMs for rapid prototyping and integrated mobile sensing applications, while identifying limitations and outlining future directions such as instruction tuning and broader user studies to advance practical deployment.
Abstract
Surface sensing is widely employed in health diagnostics, manufacturing and safety monitoring. Advances in mobile sensing affords this potential for context awareness in mobile computing, typically with a single sensing modality. Emerging multimodal large-scale language models offer new opportunities. We propose MultiSurf-GPT, which utilizes the advanced capabilities of GPT-4o to process and interpret diverse modalities (radar, microscope and multispectral data) uniformly based on prompting strategies (zero-shot and few-shot prompting). We preliminarily validated our framework by using MultiSurf-GPT to identify low-level information, and to infer high-level context-aware analytics, demonstrating the capability of augmenting context-aware insights. This framework shows promise as a tool to expedite the development of more complex context-aware applications in the future, providing a faster, more cost-effective, and integrated solution.
