A Case for Application-Aware Space Radiation Tolerance in Orbital Computing
Meiqi Wang, Han Qiu, Longnv Xu, Di Wang, Yuanjie Li, Tianwei Zhang, Jun Liu, Hewu Li
TL;DR
This work addresses the reliability of in-orbit computing on commercial off-the-shelf (COTS) hardware under space radiation by shifting protection from generic hardware/software redundancy to application-aware strategies for DNN workloads. It introduces RedNet, which uses a bounded LogClip activation to suppress error propagation and a multi-exit mechanism to allow early, confidence-based exits, reducing the impact of SEUs and MCUs on inference. The authors validate RedNet through real-world ground tests on Chaohu-1-like hardware and extensive hardware-in-the-loop emulations, showing that the influence of radiation errors can be suppressed to $\approx 0$ while inference throughput improves by $8.4\%$ to $33.0\%$ with negligible memory overhead. The work also provides an open-source radiation emulator and RedNet artifacts, offering a practical path toward application-aware fault tolerance for orbital AI tasks and a benchmark for future research in reliable space computing.
Abstract
We are witnessing a surge in the use of commercial off-the-shelf (COTS) hardware for cost-effective in-orbit computing, such as deep neural network (DNN) based on-satellite sensor data processing, Earth object detection, and task decision. However, once exposed to harsh space environments, COTS hardware is vulnerable to cosmic radiation and suffers from exhaustive single-event upsets (SEUs) and multi-unit upsets (MCUs), both of which threaten the functionality and correctness of in-orbit computing. Existing hardware and system software protections against radiation are expensive for resource-constrained COTS nanosatellites and overwhelming for upper-layer applications because they require heavy resource redundancy and frequent reboots. Instead, we make a case for cost-effective space radiation tolerance using application domain knowledge. Our solution for on-satellite DNN tasks, RedNet, exploits the uneven SEU/MCU sensitivity across DNN layers and the spatial correlation of MCUs for lightweight radiation-tolerant in-orbit AI computing. Our extensive experiments using Chaohu-1 SAR satellite payloads and a hardware-in-the-loop, real data-driven space radiation emulator validate that RedNet can suppress the influence of radiation errors to $\approx 0$ and accelerate on-satellite DNN inference by 8.4% to 33.0% at negligible extra cost.
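To make the two ideas named above more concrete, the following is a minimal illustrative sketch, not RedNet's actual implementation. It emulates an SEU as a single bit flip in a float32 weight, compares an unbounded ReLU with a bounded activation, and picks the first exit whose softmax confidence clears a threshold. The function `bounded_activation` is a hypothetical stand-in (log compression followed by clipping); the exact LogClip definition is not given in this section. The bound `upper=6.0`, the threshold `0.9`, and all helper names are assumptions chosen for illustration.

```python
import math
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0 = least significant) of a float32 encoding to emulate an SEU."""
    (raw,) = struct.unpack("<I", struct.pack("<f", value))
    (flipped,) = struct.unpack("<f", struct.pack("<I", raw ^ (1 << bit)))
    return flipped


def relu(x: float) -> float:
    """Unbounded activation: a corrupted value passes through unchanged."""
    return max(0.0, x)


def bounded_activation(x: float, upper: float = 6.0) -> float:
    """Hypothetical bounded activation (NOT the paper's LogClip definition):
    log-compress the positive part, then clip to an assumed upper bound."""
    return min(math.log1p(max(0.0, x)), upper)


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit(exit_logits, threshold: float = 0.9):
    """Return (exit_index, predicted_class) for the first exit whose top
    softmax probability reaches the threshold; fall back to the last exit."""
    for index, logits in enumerate(exit_logits):
        probs = softmax(logits)
        confidence = max(probs)
        if confidence >= threshold:
            break
    return index, probs.index(confidence)


if __name__ == "__main__":
    weight, activation_in = 0.5, 2.0
    corrupted = flip_bit(weight, 30)  # flips the top exponent bit of the float32
    print("clean weight:", weight, "corrupted weight:", corrupted)
    print("ReLU output:", relu(corrupted * activation_in))
    print("bounded output:", bounded_activation(corrupted * activation_in))
    print("chosen exit:", early_exit([[1.0, 1.2, 0.9], [0.1, 5.0, 0.2]]))
```

In this toy setting the exponent bit flip inflates the weight by many orders of magnitude. The unbounded activation forwards that value to later layers, while the bounded one caps it at the assumed limit. The early-exit helper illustrates why a confident shallow exit reduces exposure: layers that are never executed cannot contribute corrupted values to the result.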
