Enabling Near-realtime Remote Sensing via Satellite-Ground Collaboration of Large Vision-Language Models

Zihan Li; Jiahao Yang; Yuxin Zhang; Zhe Chen; Yue Gao

TL;DR

This work addresses the challenge of delivering near-realtime LVLM-based inference for remote sensing (RS) on LEO satellites, which have limited onboard compute and only intermittent ground links. It introduces Grace, a satellite-ground collaborative framework that partitions inference between compact onboard LVLMs and larger ground-based LVLMs, connected by a dynamic, multimodal RAG knowledge archive and a confidence-driven task dispatcher. Key contributions include a dynamic replacement and priority mechanism for the satellite archive, a hierarchical transmission scheme that prioritizes recent queries, and a confidence-based cognitive test that decides when to offload. Together these reduce latency by 76–95% while maintaining accuracy. The approach enables scalable, bandwidth-aware, near-realtime RS processing and points toward future enhancements using inter-satellite links (ISLs) and broader deployment of LVLM-based remote sensing analytics.

Abstract

Large vision-language models (LVLMs) have recently demonstrated great potential in remote sensing (RS) tasks (e.g., disaster monitoring) conducted by low Earth orbit (LEO) satellites. However, their deployment in real-world LEO satellite systems remains largely unexplored, hindered by limited onboard computing resources and brief satellite-ground contacts. We propose Grace, a satellite-ground collaborative system designed for near-realtime LVLM inference in RS tasks. Grace deploys a compact LVLM on satellites for realtime inference and larger ones on ground stations (GSs) to guarantee end-to-end performance. It comprises two main components: an asynchronous satellite-GS Retrieval-Augmented Generation (RAG) scheme and a task dispatch algorithm. First, we distill the knowledge archive of the GS RAG into a satellite archive, using a tailored adaptive update algorithm during the limited satellite-ground data exchange periods. Second, we propose a confidence-based test algorithm that either processes the task onboard the satellite or offloads it to the GS. Extensive experiments based on real-world satellite orbital data show that Grace reduces the average latency by 76-95% compared to state-of-the-art methods, without compromising inference accuracy.
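
The abstract does not specify how the confidence-based test is computed, so the following is only a minimal sketch of the dispatch idea under stated assumptions: the onboard compact LVLM is assumed to return an answer together with a scalar confidence in [0, 1], and the task is offloaded to the GS when that confidence falls below a threshold. The names `OnboardResult` and `dispatch` and the default threshold of 0.8 are hypothetical and are not taken from Grace.

```python
from dataclasses import dataclass


@dataclass
class OnboardResult:
    """Hypothetical output of the compact onboard LVLM."""
    answer: str
    confidence: float  # assumed to lie in [0, 1]


def dispatch(result: OnboardResult, threshold: float = 0.8) -> str:
    """Decide where a task is finalized.

    Returns "onboard" to keep the satellite's answer, or "offload"
    to defer the task to the larger LVLM at the ground station.
    The threshold value is an illustrative assumption.
    """
    if not 0.0 <= result.confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    return "onboard" if result.confidence >= threshold else "offload"
```

In this sketch a high-confidence answer is returned immediately, which avoids waiting for the next satellite-ground contact, while a low-confidence one is queued for the GS model.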
