Multi-Stream Cellular Test-Time Adaptation of Real-Time Models Evolving in Dynamic Environments

Benoît Gérin, Anaïs Halin, Anthony Cioppa, Maxim Henry, Bernard Ghanem, Benoît Macq, Christophe De Vleeschouwer, Marc Van Droogenbroeck

TL;DR

The paper tackles the challenge of real-time adaptation for lightweight perception models deployed on fleets of autonomous agents operating in dynamic environments. It introduces Multi-Stream Cellular Test-Time Adaptation (MSC-TTA), a framework that partitions the environment into cells and uses a fast on-board inference path together with a slow cloud-backed training path guided by per-cell teacher models and replay buffers. Through an adaptive student–teacher mechanism, cells aggregate multi-stream data to train specialized student models that are broadcast back to agents, with instantaneous switches when cells change. A new large-scale synthetic dataset, DADE, based on CARLA, supports benchmarking of semantic segmentation under diverse locations and weather conditions. Experimental results on DADE show that MSC-TTA outperforms single-stream baselines and ARTHuS-like approaches, demonstrating improved robustness and real-time adaptability for IoT/5G-enabled autonomous driving applications.

Abstract

In the era of the Internet of Things (IoT), objects connect through a dynamic network, empowered by technologies like 5G, enabling real-time data sharing. However, smart objects, notably autonomous vehicles, face challenges in critical local computations due to limited resources. Lightweight AI models offer a solution but struggle with diverse data distributions. To address this limitation, we propose a novel Multi-Stream Cellular Test-Time Adaptation (MSC-TTA) setup where models adapt on the fly to a dynamic environment divided into cells. Then, we propose a real-time adaptive student-teacher method that leverages the multiple streams available in each cell to quickly adapt to changing data distributions. We validate our methodology in the context of autonomous vehicles navigating across cells defined based on location and weather conditions. To facilitate future benchmarking, we release a new multi-stream large-scale synthetic semantic segmentation dataset, called DADE, and show that our multi-stream approach outperforms a single-stream baseline. We believe that our work will open research opportunities in the IoT and 5G eras, offering solutions for real-time model adaptation.
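As a rough illustration of the cell idea described in the abstract, the sketch below keys each cell by a (location, weather) pair and swaps an agent's weights as soon as it enters a different cell. The names `CellRegistry`, `Agent`, `publish`, `latest`, and `move_to`, and the use of plain dictionaries as model weights, are assumptions made for this example; they are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Assumption: a cell is identified by a (location, weather) pair.
CellId = Tuple[str, str]


@dataclass
class CellRegistry:
    """Holds the latest student weights published for each cell (hypothetical helper)."""
    weights: Dict[CellId, dict] = field(default_factory=dict)

    def publish(self, cell: CellId, new_weights: dict) -> None:
        # Called whenever a cell's student model finishes a training round.
        self.weights[cell] = new_weights

    def latest(self, cell: CellId, fallback: dict) -> dict:
        # Cells without a trained student yet fall back to default weights.
        return self.weights.get(cell, fallback)


@dataclass
class Agent:
    name: str
    cell: CellId
    model_weights: dict

    def move_to(self, new_cell: CellId, registry: CellRegistry, fallback: dict) -> None:
        # On a cell change, swap in the new cell's weights immediately.
        if new_cell != self.cell:
            self.cell = new_cell
            self.model_weights = registry.latest(new_cell, fallback)


pretrained = {"version": "pretrained"}
registry = CellRegistry()
registry.publish(("highway", "rain"), {"version": "highway-rain-v3"})

car = Agent("car-0", ("forest", "clear"), pretrained)
car.move_to(("highway", "rain"), registry, pretrained)
print(car.model_weights["version"])  # weights now come from the highway/rain cell
```

Falling back to pretrained weights for cells that have no student yet is a design choice of this sketch, not something the abstract specifies.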

Paper Structure (25 sections, 1 equation, 15 figures, 6 tables)

This paper contains 25 sections, 1 equation, 15 figures, 6 tables.

Figures (15)

  • Figure 1: Multi-Stream Cellular Test-Time Adaptation (MSC-TTA) of real-time models. We consider a set of agents (e.g., autonomous vehicles) evolving in a dynamic environment divided into cells (e.g., city center or suburb). All agents perform the same task (e.g., semantic segmentation) in real time on their own unlabeled data stream (e.g., recorded images) using an on-board model. We propose a first method in which agents share part of their data stream through an IoT network (e.g., a connection to a 5G tower). Cell-based lightweight models are then trained on the fly (in our case through an adaptive student-teacher method) and their weights are regularly broadcast to the agents to improve their performance over time. When an agent transitions between cells, its model is immediately switched to that of the new cell, effectively adapting the predictions of the transiting agent.
  • Figure 2: Pipeline of our multi-stream cellular test-time adaptation of real-time models. Our method is composed of a fast route for inference and a slow route for online training, as defined in [Cioppa2019ARTHuS, Houyon2023Online]. In the fast route, each agent $a_{n}$ processes a stream of data samples $x_{a_{n}}^{t}$ and predicts labels $\hat{y}_{a_{n}}^{t}=f_{a_{n}}^{t}(x_{a_{n}}^{t})$ in real time (i.e., at the data stream rate $r_{\mathcal{X}}$). Agents located within a cell $c$ send a subset of their data samples (at a slower rate $r_{\mathcal{T}}$) to a slow route operating on a remote server (e.g., on the cloud) dedicated to that cell. In the slow route, a teacher model $\mathcal{T}_{c}^{t'}$ predicts pseudo labels on the received data and stores them in a replay buffer $\mathcal{R}_{c}^{t'}$. The replay buffer is then used to train a cell-specific student model $\mathcal{S}_{c}^{t'}$ on the fly at a rate $r_{\mathcal{S}}$. After each training epoch on the replay buffer, the parameters of $\mathcal{S}_{c}$ are transferred to all agent models $f_{a_{n}}$ located within that cell. Finally, agents transiting between two cells have their model switched instantly.
  • Figure 3: Images of the different locations in our dataset. We define $7$ locations based on GNSS data. From left to right: forest, countryside, rural farmland, highway, low density residential, community buildings, and high density residential.
  • Figure 4: Evolution of the fleet performance over time on DADE-static weather (top) and DADE-dynamic weather (bottom). Comparison of the performance in the MSC-OL setup (left) and the MSC-TTA setup (right) of the best adaptive settings along with the baseline for each pretraining (Scratch, General, and Cell).
  • Figure 5: Qualitative results. Comparison of different segmentation masks. From left to right: RGB image, ground truth, Baseline, Common scenario with General pretraining, and Spatial scenario with Cell pretraining. Black areas correspond to non-evaluated classes.
  • ...and 10 more figures
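
The fast and slow routes described in the Figure 2 caption can be sketched for a single cell as follows. This is a toy simulation under stated assumptions, not the authors' implementation: scalar linear models stand in for the segmentation networks, the teacher is a fixed function, and `CellServer`, `simulate`, `send_every`, and `train_every` are hypothetical names, with the last two playing the role of the slower rates $r_{\mathcal{T}}$ and $r_{\mathcal{S}}$ relative to the stream rate $r_{\mathcal{X}}$.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


class CellServer:
    """Slow route for one cell: teacher -> replay buffer -> student -> broadcast."""

    def __init__(self, teacher: Callable[[float], float],
                 buffer_size: int = 64, lr: float = 0.05):
        self.teacher = teacher
        # Bounded replay buffer of (sample, pseudo label) pairs.
        self.buffer: Deque[Tuple[float, float]] = deque(maxlen=buffer_size)
        self.student_w = 0.0  # toy student: y_hat = w * x
        self.lr = lr

    def receive(self, x: float) -> None:
        # The teacher pseudo-labels the sample; the oldest entry is evicted when full.
        self.buffer.append((x, self.teacher(x)))

    def train_epoch(self) -> None:
        # One SGD pass over the replay buffer on the squared error.
        for x, y in self.buffer:
            grad = 2.0 * (self.student_w * x - y) * x
            self.student_w -= self.lr * grad

    def broadcast(self, agents: List[Dict[str, float]]) -> None:
        # Copy the student parameters to every agent currently in the cell.
        for agent in agents:
            agent["w"] = self.student_w


def simulate(steps: int = 200, send_every: int = 5, train_every: int = 20):
    server = CellServer(teacher=lambda x: 2.0 * x)  # assumed stand-in teacher
    agents = [{"w": 0.0} for _ in range(3)]         # three streams in one cell
    for t in range(1, steps + 1):
        for i, agent in enumerate(agents):
            x = ((t + i) % 10) / 10.0               # deterministic toy stream
            _prediction = agent["w"] * x            # fast route: predict every step
            if t % send_every == 0:
                server.receive(x)                   # slow route: subsampled upload
        if t % train_every == 0:
            server.train_epoch()
            server.broadcast(agents)
    return server, agents


server, agents = simulate()
print(round(server.student_w, 3), len(server.buffer))
```

In this toy run the student weight should approach the teacher's slope of 2.0, and every agent ends up holding the weights from the last broadcast. Pooling samples from all three streams into one buffer is what makes the sketch multi-stream; a single-stream variant would keep one buffer per agent.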