
R2SNet: Scalable Domain Adaptation for Object Detection in Cloud-Based Robotic Ecosystems via Proposal Refinement

Michele Antonazzi, Matteo Luperto, N. Alberto Borghese, Nicola Basilico

TL;DR

This work tackles domain shift in cloud-based object detection for multi-robot systems by introducing R2SNet, a lightweight downstream refinement network that runs on the robots to relabel, rescore, and suppress cloud-generated proposals before final post-processing. The network combines BD and ID features through two symmetric MLP-based branches, with a BFNet extracting image descriptors from a grid-mapped feature representation, which enables environment-specific adaptation without retraining the cloud models. Empirical results on door-detection tasks show significant mAP gains with modest amounts of data and inference rates that are feasible on edge devices (up to 16.7 Hz on GPU and 2.6 Hz on CPU on a Jetson TX2). The approach reduces dependence on cloud re-training, preserves privacy, and may generalize to perception tasks beyond door detection.

Abstract

We introduce a novel approach for scalable domain adaptation in cloud robotics scenarios where robots rely on third-party AI inference services powered by large pre-trained deep neural networks. Our method is based on a downstream proposal-refinement stage running locally on the robots, exploiting a new lightweight DNN architecture, R2SNet. This architecture aims to mitigate performance degradation from domain shifts by adapting the object detection process to the target environment, focusing on relabeling, rescoring, and suppression of bounding-box proposals. Our method allows for local execution on robots, addressing the scalability challenges of domain adaptation without incurring significant computational costs. Real-world results on mobile service robots performing door detection show the effectiveness of the proposed method in achieving scalable domain adaptation.
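
As a rough illustration of the refinement stage described above, the following minimal sketch applies relabeling, rescoring, and suppression to cloud-generated proposals and then runs a conventional post-processing step (score threshold plus per-class non-maximum suppression). It is not the authors' implementation: the `Proposal` and `Refinement` types, the function names, and the thresholds are all hypothetical, and the refinement outputs are assumed to be given rather than computed by R2SNet.

```python
from dataclasses import dataclass, replace
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


@dataclass(frozen=True)
class Proposal:
    """A bounding-box proposal as returned by the cloud detector."""
    box: Box
    label: int
    score: float


@dataclass(frozen=True)
class Refinement:
    """Hypothetical per-proposal output of an on-robot refinement network."""
    label: int      # refined class label (relabeling)
    score: float    # refined confidence (rescoring)
    suppress: bool  # True if the proposal should be discarded (suppression)


def apply_refinement(proposals: Sequence[Proposal],
                     refinements: Sequence[Refinement]) -> List[Proposal]:
    """Relabel, rescore, and suppress proposals; boxes are left unchanged."""
    if len(proposals) != len(refinements):
        raise ValueError("expected one refinement per proposal")
    return [replace(p, label=r.label, score=r.score)
            for p, r in zip(proposals, refinements) if not r.suppress]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def post_process(proposals: Sequence[Proposal],
                 min_score: float = 0.5,
                 iou_threshold: float = 0.5) -> List[Proposal]:
    """Score filtering followed by greedy per-class non-maximum suppression."""
    candidates = sorted((p for p in proposals if p.score >= min_score),
                        key=lambda p: p.score, reverse=True)
    kept: List[Proposal] = []
    for p in candidates:
        overlaps = any(p.label == q.label and iou(p.box, q.box) > iou_threshold
                       for q in kept)
        if not overlaps:
            kept.append(p)
    return kept
```

In this sketch the refinement runs before post-processing, so a proposal whose cloud score falls below the threshold can still survive if the refinement raises its confidence.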

Paper Structure

This paper contains 10 sections, 8 equations, 5 figures, 2 tables.

Figures (5)

  • Figure 1: A general overview of the cloud-based scenario we consider.
  • Figure 2: R2SNet refinements in filtering dense proposals, compared to standard post-processing. Green/red bounding boxes are open/closed doors.
  • Figure 3: The R2SNet architecture. Batch normalization and ReLU activation functions are applied to all layers of the shared MLPs.
  • Figure 4: The BFNet architecture.
  • Figure 5: $\text{R2S}_{75}^k$ performance as $k$ varies, measured by mAP (top row) and by the additional indicators (bottom row).