Network-Adaptive Cloud Preprocessing for Visual Neuroprostheses

Jiayi Liu; Yilin Wang; Michael Beyeler

TL;DR

This work addresses the latency challenges of cloud-assisted preprocessing for visual neuroprostheses by introducing a network-adaptive encoding policy that uses real-time RTT feedback to modulate JPEG quality, resolution, and transmission interval. By operating in five discrete regimes, the system prioritizes temporal continuity under adverse network conditions, substantially reducing end-to-end latency with only modest loss of global scene structure and greater degradation of boundary precision. Across multiple network scenarios, adaptive encoding lowers RTT and server inference time, particularly under congestion, while preserving usable perceptual fidelity for mobility tasks. The study provides a principled framework for balancing computation, communication, and neural stimulation timing, outlining operating regimes and design considerations essential for safe and effective cloud-assisted prosthetic vision.

Abstract

Cloud-based machine learning is increasingly explored as a preprocessing strategy for next-generation visual neuroprostheses, where advanced scene understanding may exceed the computational and energy constraints of battery-powered visual processing units (VPUs). Offloading computation to remote servers enables the use of state-of-the-art vision models, but also introduces sensitivity to network latency, jitter, and packet loss, which can disrupt the temporal consistency of the delivered neural stimulus. In this work, we examine the feasibility of cloud-assisted visual preprocessing for artificial vision by framing remote inference as a perceptually constrained systems problem. We present a network-adaptive cloud-assisted pipeline in which real-time round-trip-time (RTT) feedback is used to dynamically modulate image resolution, compression, and transmission rate, explicitly prioritizing temporal continuity under adverse network conditions. Using a Raspberry Pi 4 as a simulated VPU and a client-server architecture, we evaluate system performance across a range of realistic wireless network regimes. Results show that adaptive visual encoding substantially reduces end-to-end latency during network congestion, with only modest degradation of global scene structure, while boundary precision degrades more sharply. Together, these findings delineate operating regimes in which cloud-assisted preprocessing may remain viable for future visual neuroprostheses and underscore the importance of network-aware adaptation for maintaining perceptual stability.
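The tiered adaptation described above can be sketched as a lookup from a smoothed RTT estimate to an encoding configuration. This is a minimal illustration, not the authors' implementation. The five-tier count follows the TL;DR, but the RTT thresholds, JPEG quality values, resolution scales, and transmission intervals are placeholders, and the exponential smoothing step is an assumption. The names `EncodingConfig`, `TIERS`, `smooth_rtt`, and `select_config` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EncodingConfig:
    jpeg_quality: int   # JPEG quality setting (1-100)
    scale: float        # fraction of native resolution to transmit
    interval_ms: int    # delay between transmitted frames, in milliseconds


# Hypothetical tiers as (exclusive upper RTT bound in ms, configuration).
# Every number below is a placeholder, not a value reported in the paper.
TIERS = [
    (100.0, EncodingConfig(jpeg_quality=90, scale=1.00, interval_ms=100)),
    (200.0, EncodingConfig(jpeg_quality=75, scale=0.75, interval_ms=150)),
    (400.0, EncodingConfig(jpeg_quality=60, scale=0.50, interval_ms=250)),
    (800.0, EncodingConfig(jpeg_quality=40, scale=0.35, interval_ms=400)),
    (float("inf"), EncodingConfig(jpeg_quality=25, scale=0.25, interval_ms=600)),
]


def smooth_rtt(previous_ms: float, sample_ms: float, alpha: float = 0.3) -> float:
    """Blend a new RTT sample into the running estimate (assumed EWMA filter)."""
    return (1.0 - alpha) * previous_ms + alpha * sample_ms


def select_config(rtt_ms: float) -> EncodingConfig:
    """Return the configuration of the first tier whose bound exceeds rtt_ms."""
    for upper_bound_ms, config in TIERS:
        if rtt_ms < upper_bound_ms:
            return config
    # Reached only for an infinite RTT; fall back to the most degraded tier.
    return TIERS[-1][1]
```

In a client loop, each measured round trip would update the estimate through `smooth_rtt`, and the next frame would be resized, compressed, and scheduled according to `select_config`. Lower tiers trade image detail for a smaller payload, which matches the stated priority of temporal continuity over boundary precision.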

Paper Structure

This paper contains 21 sections, 2 equations, 3 figures, 3 tables.

INTRODUCTION
METHODS
System Overview and Architecture
Network-Adaptive Visual Encoding Policy
Network Feedback Signal
Tiered Reconfiguration Policy
Remote Semantic Preprocessing
Client-Server Communication
Experimental Setup and Network Impairment Model
Outcome Measures
Temporal Responsiveness
Perceptual Fidelity Measures
RESULTS
Temporal Responsiveness Under Network Impairment
Server-Side Inference Time
...and 6 more sections

Figures (3)

Figure 1: Network-adaptive cloud processing for visual neuroprostheses. Egocentric video captured by a resource-constrained VPU is adaptively encoded prior to transmission based on real-time network feedback. RTT measurements drive a closed-loop controller that modulates image resolution, compression, and transmission interval to maintain temporal continuity of the delivered visual stimulus under network impairments. Remote semantic segmentation is performed in the cloud, and the resulting simplified scene is returned to the client. In the present system, reduced semantic detail under network congestion arises indirectly from input degradation; future implementations could achieve similar effects by explicitly requesting coarse vs. fine semantic outputs from the cloud inference service. Scene images were generated using Nano Banana Pro for illustrative purposes only.
Figure 2: End-to-end round-trip time (RTT) distributions under five simulated network conditions, comparing a static baseline with the proposed network-adaptive encoding policy.
Figure 3: Mean server-side inference time under each network condition for static and adaptive configurations.
