LQA: A Lightweight Quantized-Adaptive Framework for Vision-Language Models on the Edge
Xin Wang, Hualin Zhou, Sheng Guang Wang, Ting Dang, Yu Zhang, Hong Jia, Tao Gu
TL;DR
Vision-Language Models struggle to run robustly on edge devices due to distribution shifts and limited resources. The authors introduce LQA, a lightweight framework that jointly quantizes the backbone with modality-aware precision and performs gradient-free, cache-based test-time adaptation (Q-TTA) entirely in low precision. Core contributions include Selective Hybrid Quantization (SHQ) with Hessian-aware vision quantization and selective precision retention, plus a fully quantized Q-TTA mechanism that uses positive/negative exemplar caches. Across seven datasets with synthetic and real-world shifts, LQA achieves up to 19.9× memory savings and outperforms gradient-based TTA methods, enabling practical, privacy-preserving VLM deployment on edge devices.
Abstract
Deploying Vision-Language Models (VLMs) on edge devices is hindered by tight resource constraints and by performance degradation under distribution shifts. Test-time adaptation (TTA) can counteract such shifts, but existing methods are too resource-intensive for on-device deployment. To address this challenge, we propose LQA, a lightweight, quantized-adaptive framework for VLMs that combines a modality-aware quantization strategy with gradient-free test-time adaptation. Specifically, we introduce Selective Hybrid Quantization (SHQ) and a quantized, gradient-free adaptation mechanism that together enable robust and efficient VLM deployment on resource-constrained hardware. Experiments on seven open-source datasets, covering both synthetic and real-world distribution shifts, show that LQA improves overall adaptation performance by 4.5%, uses less memory than full-precision models, and significantly outperforms gradient-based TTA methods, with up to 19.9× lower memory usage. These results demonstrate that LQA offers a practical pathway for robust, privacy-preserving, and efficient VLM deployment on edge devices.
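To make the idea of quantized, gradient-free, cache-based adaptation more concrete, the following is a minimal illustrative sketch in pure Python. It is not the LQA implementation: the function and class names (`quantize_int8`, `ExemplarCache`, `adapt_and_predict`), the symmetric per-tensor int8 scheme, the entropy-based cache admission rule, and all hyperparameters (`alpha`, `beta`, `neg_band`, `neg_thresh`, `sharpness`) are assumptions chosen for illustration. Only the cached exemplars and the similarity dot products are kept in integers here; the remaining arithmetic is ordinary floating point.

```python
import math

def quantize_int8(x):
    """Symmetric per-tensor quantization to integers in [-127, 127] (assumed scheme)."""
    max_abs = max(abs(v) for v in x)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in x]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]

class ExemplarCache:
    """Per-class store of quantized feature exemplars with bounded capacity."""

    def __init__(self, num_classes, capacity=3):
        self.num_classes = num_classes
        self.capacity = capacity
        self.slots = [[] for _ in range(num_classes)]

    def add(self, cls, feat, entropy):
        q, scale = quantize_int8(feat)
        items = self.slots[cls]
        items.append((entropy, q, scale))
        items.sort(key=lambda item: item[0])  # most confident exemplars first
        del items[self.capacity:]             # evict the least confident ones

    def logits(self, feat, sharpness=5.0):
        fq, fscale = quantize_int8(feat)
        out = [0.0] * self.num_classes
        for cls, items in enumerate(self.slots):
            for _, q, scale in items:
                # Integer dot product, rescaled once to approximate cosine similarity.
                sim = sum(a * b for a, b in zip(q, fq)) * scale * fscale
                out[cls] += math.exp(-sharpness * (1.0 - sim))
        return out

def adapt_and_predict(feat, text_feats, pos, neg, alpha=2.0, beta=0.5,
                      neg_band=(0.2, 0.5), neg_thresh=0.03):
    """Update both caches with one test sample and return the adapted prediction.

    No gradients are computed; adaptation happens only through cache contents.
    """
    norm = math.sqrt(sum(v * v for v in feat))
    feat = [v / norm for v in feat]
    zero_shot = [100.0 * sum(t * f for t, f in zip(row, feat)) for row in text_feats]
    probs = softmax(zero_shot)
    # Entropy normalized to [0, 1] serves as the confidence signal.
    entropy = -sum(p * math.log(p + 1e-12) for p in probs) / math.log(len(probs))
    pred = max(range(len(probs)), key=probs.__getitem__)
    pos.add(pred, feat, entropy)  # positive cache: sample supports its pseudo-label
    if neg_band[0] < entropy < neg_band[1]:
        # Negative cache: moderately uncertain samples vote against unlikely classes.
        for cls, p in enumerate(probs):
            if p < neg_thresh:
                neg.add(cls, feat, entropy)
    pos_l, neg_l = pos.logits(feat), neg.logits(feat)
    final = [z + alpha * a - beta * b for z, a, b in zip(zero_shot, pos_l, neg_l)]
    return max(range(len(final)), key=final.__getitem__)
```

As a usage example, with three hypothetical class text embeddings `text_feats = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]` and two caches created as `ExemplarCache(3)`, repeatedly calling `adapt_and_predict([0.9, 0.1, 0.0, 0.1], text_feats, pos, neg)` fills the positive cache for class 0 up to its capacity while the prediction stays at class 0. Storing exemplars as integers plus one scale per vector is what keeps the cache footprint small in this sketch.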
