Network-Adaptive Cloud Preprocessing for Visual Neuroprostheses

Jiayi Liu, Yilin Wang, Michael Beyeler

TL;DR

This work addresses the latency challenges of cloud-assisted preprocessing for visual neuroprostheses by introducing a network-adaptive encoding policy that uses real-time RTT feedback to modulate JPEG quality, resolution, and transmission interval. By operating in five discrete regimes, the system prioritizes temporal continuity under adverse network conditions, substantially reducing end-to-end latency with only modest loss of global scene structure and greater degradation of boundary precision. Across multiple network scenarios, adaptive encoding lowers RTT and server inference time, particularly under congestion, while preserving usable perceptual fidelity for mobility tasks. The study provides a principled framework for balancing computation, communication, and neural stimulation timing, outlining operating regimes and design considerations essential for safe and effective cloud-assisted prosthetic vision.
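The five-regime policy described above can be pictured as a lookup from measured RTT to an encoding configuration. The following is a minimal sketch only: the regime names, RTT thresholds, JPEG quality values, resolutions, and send intervals are all hypothetical placeholders chosen for illustration, not values reported by the authors. It also assumes that the transmission interval grows as RTT grows.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class EncodingRegime:
    """One discrete operating point of the encoder (illustrative fields)."""
    name: str
    jpeg_quality: int             # JPEG quality setting, higher is better
    resolution: Tuple[int, int]   # (width, height) in pixels
    send_interval_s: float        # seconds between transmitted frames


# Each entry is (exclusive upper RTT bound in ms, regime).
# ASSUMPTION: every threshold and setting below is a made-up placeholder.
REGIMES: List[Tuple[float, EncodingRegime]] = [
    (100.0, EncodingRegime("excellent", 85, (640, 480), 0.10)),
    (200.0, EncodingRegime("good", 70, (480, 360), 0.15)),
    (400.0, EncodingRegime("fair", 55, (320, 240), 0.25)),
    (800.0, EncodingRegime("poor", 40, (240, 180), 0.40)),
    (float("inf"), EncodingRegime("severe", 25, (160, 120), 0.60)),
]


def select_regime(rtt_ms: float) -> EncodingRegime:
    """Return the first regime whose upper RTT bound exceeds rtt_ms."""
    if rtt_ms < 0:
        raise ValueError("RTT must be non-negative")
    for upper_bound_ms, regime in REGIMES:
        if rtt_ms < upper_bound_ms:
            return regime
    # Reached only for non-finite input; fall back to the most conservative regime.
    return REGIMES[-1][1]


if __name__ == "__main__":
    for rtt in (40.0, 250.0, 1200.0):
        print(rtt, select_regime(rtt))
```

Keeping the policy as a small ordered table makes the trade-off explicit: each step down gives up spatial detail (quality, resolution) and frame rate in exchange for smaller, less frequent payloads.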

Abstract

Cloud-based machine learning is increasingly explored as a preprocessing strategy for next-generation visual neuroprostheses, where advanced scene understanding may exceed the computational and energy constraints of battery-powered visual processing units (VPUs). Offloading computation to remote servers enables the use of state-of-the-art vision models, but also introduces sensitivity to network latency, jitter, and packet loss, which can disrupt the temporal consistency of the delivered neural stimulus. In this work, we examine the feasibility of cloud-assisted visual preprocessing for artificial vision by framing remote inference as a perceptually constrained systems problem. We present a network-adaptive cloud-assisted pipeline in which real-time round-trip-time (RTT) feedback is used to dynamically modulate image resolution, compression, and transmission rate, explicitly prioritizing temporal continuity under adverse network conditions. Using a Raspberry Pi 4 as a simulated VPU and a client-server architecture, we evaluate system performance across a range of realistic wireless network regimes. Results show that adaptive visual encoding substantially reduces end-to-end latency during network congestion, with only modest degradation of global scene structure, while boundary precision degrades more sharply. Together, these findings delineate operating regimes in which cloud-assisted preprocessing may remain viable for future visual neuroprostheses and underscore the importance of network-aware adaptation for maintaining perceptual stability.
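Because individual RTT samples are noisy under jitter, a closed-loop controller of this kind would plausibly act on a smoothed estimate rather than on each raw measurement. The sketch below shows one common way to do that, an exponentially weighted moving average. This is an assumption made for illustration: the abstract does not state how the authors filter RTT feedback, and the class name `RttEstimator` and its default weight are hypothetical (the default of 1/8 mirrors the weight TCP uses for its smoothed RTT).

```python
from typing import Optional


class RttEstimator:
    """Exponentially weighted moving average of RTT samples (illustrative)."""

    def __init__(self, alpha: float = 0.125) -> None:
        # alpha is the weight given to the newest sample.
        if not 0.0 < alpha <= 1.0:
            raise ValueError("alpha must be in (0, 1]")
        self.alpha = alpha
        self.smoothed_ms: Optional[float] = None

    def update(self, sample_ms: float) -> float:
        """Fold one RTT sample (in ms) into the estimate and return it."""
        if sample_ms < 0:
            raise ValueError("RTT sample must be non-negative")
        if self.smoothed_ms is None:
            # The first sample initializes the estimate directly.
            self.smoothed_ms = sample_ms
        else:
            self.smoothed_ms = (
                (1.0 - self.alpha) * self.smoothed_ms + self.alpha * sample_ms
            )
        return self.smoothed_ms


if __name__ == "__main__":
    estimator = RttEstimator(alpha=0.5)
    for sample in (100.0, 200.0, 150.0):
        print(estimator.update(sample))
```

A larger `alpha` reacts faster to congestion but also to transient spikes; a smaller one gives steadier encoder settings at the cost of slower adaptation.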

Paper Structure (21 sections, 2 equations, 3 figures, 3 tables)


Figures (3)

  • Figure 1: Network-adaptive cloud processing for visual neuroprostheses. Egocentric video captured by a resource-constrained VPU is adaptively encoded prior to transmission based on real-time network feedback. RTT measurements drive a closed-loop controller that modulates image resolution, compression, and transmission interval to maintain temporal continuity of the delivered visual stimulus under network impairments. Remote semantic segmentation is performed in the cloud, and the resulting simplified scene is returned to the client. In the present system, reduced semantic detail under network congestion arises indirectly from input degradation; future implementations could achieve similar effects by explicitly requesting coarse vs. fine semantic outputs from the cloud inference service. Scene images were generated using Nano Banana Pro for illustrative purposes only.
  • Figure 2: End-to-end round-trip time (RTT) distributions under five simulated network conditions, comparing a static baseline with the proposed network-adaptive encoding policy.
  • Figure 3: Mean server-side inference time under each network condition for static and adaptive configurations.