CODEI: Resource-Efficient Task-Driven Co-Design of Perception and Decision Making for Mobile Robots Applied to Autonomous Vehicles

Dejan Milojevic, Gioele Zardini, Miriam Elser, Andrea Censi, Emilio Frazzoli

TL;DR

CODEI tackles the problem of resource-efficient, task-driven co-design for mobile robots by linking perception requirements to motion planning through occupancy queries. It introduces a monotone co-design framework and a practical pipeline that converts task queries into perception requirements (PR) via collision and predictive mappings, then solves sensor placement as a weighted set-cover problem integrated with an ILP-based outer optimization. The key contributions include the occupancy-query concept, PR/pcp/priorcheck formalism, and a two-tier optimization (inner sensor selection/placement and outer body/compute/perception trade-offs) demonstrated on an urban autonomous-vehicle case study showing how sensor choices depend on resource priorities. The results provide design guidelines showing cameras tend to minimize weight and cost, while lidar offers better perception performance under higher resource demands, with complex tasks always needing lidar. Overall, CODEI provides a computable, scalable framework to design AVs and mobile robots that balance safety, performance, and resource constraints in realistic scenarios.

Abstract

This paper discusses the integration challenges and strategies for designing mobile robots, by focusing on the task-driven, optimal selection of hardware and software to balance safety, efficiency, and minimal usage of resources such as costs, energy, computational requirements, and weight. We emphasize the interplay between perception and motion planning in decision-making by introducing the concept of occupancy queries to quantify the perception requirements for sampling-based motion planners. Sensor and algorithm performance are evaluated using False Negative Rates (FNR) and False Positive Rates (FPR) across various factors such as geometric relationships, object properties, sensor resolution, and environmental conditions. By integrating perception requirements with perception performance, an Integer Linear Programming (ILP) approach is proposed for efficient sensor and algorithm selection and placement. This forms the basis for a co-design optimization that includes the robot body, motion planner, perception pipeline, and computing unit. We refer to this framework for solving the co-design problem of mobile robots as CODEI, short for Co-design of Embodied Intelligence. A case study on developing an Autonomous Vehicle (AV) for urban scenarios provides actionable information for designers, and shows that complex tasks escalate resource demands, with task performance affecting choices of the autonomy stack. The study demonstrates that resource prioritization influences sensor choice: cameras are preferred for cost-effective and lightweight designs, while lidar sensors are chosen for better energy and computational efficiency.
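The sensor selection step described above can be read as a weighted set-cover problem: pick the cheapest set of sensors whose combined coverage includes every perception requirement. The following is a minimal sketch of that idea only, not the paper's ILP formulation. The catalog entries, costs, and requirement labels (`r1` to `r4`) are hypothetical, and exhaustive enumeration stands in for an ILP solver, which is only workable for a toy catalog.

```python
from itertools import combinations

# Hypothetical catalog (illustration only, not from the paper):
# sensor name -> (cost, set of perception requirements it covers)
CATALOG = {
    "front_camera": (1.0, {"r1", "r2"}),
    "rear_camera": (1.0, {"r3"}),
    "side_camera": (2.5, {"r4"}),
    "roof_lidar": (4.0, {"r1", "r2", "r3", "r4"}),
}


def min_cost_cover(catalog, requirements):
    """Return (cost, sensors) for the cheapest subset covering all requirements.

    Enumerates every subset of the catalog, so it is exact but exponential;
    returns None when no subset covers the requirements.
    """
    names = sorted(catalog)
    best = None
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            # Union of the requirements covered by the chosen sensors.
            covered = set().union(*(catalog[n][1] for n in subset))
            if requirements <= covered:
                cost = sum(catalog[n][0] for n in subset)
                if best is None or cost < best[0]:
                    best = (cost, subset)
    return best


if __name__ == "__main__":
    print(min_cost_cover(CATALOG, {"r1", "r2", "r3"}))
    print(min_cost_cover(CATALOG, {"r1", "r2", "r3", "r4"}))
```

With these toy numbers, the first call selects the two cameras at cost 2.0, while adding requirement `r4` makes the single lidar (cost 4.0) cheaper than the three cameras (cost 4.5). The numbers were chosen to show how the cheapest cover can change with the requirement set; they are not measurements from the case study.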

Paper Structure

This paper contains 21 sections, 5 theorems, 18 equations, 33 figures, 1 table.

Key Result

Lemma 23

The task occupancy queries $\mathrm{tq}$ is monotone in the task, as shown in fig:mdpi_planner.

Figures (33)

  • Figure 1: Graphical illustration of the informal problem definition for designing an AV for urban driving tasks, based on a catalog of hardware and software components with an emphasis on minimizing resources.
  • Figure 2: An illustration of an AV navigating towards the yellow target area. The figure showcases two motion planners: an RRT*-based planner and a lattice planner. The red lines represent the tree of paths generated by each planner, while the green line indicates the solution path identified by the planner.
  • Figure 3: This figure shows class configurations at time 0 leading to potential collisions with a robot at a specific query $\psi=\langle q_{0}^{\mathcal{R}}, \tau, \mathrm{env}\rangle$. The robot is depicted as a small red AV on the left, and the robot's future configuration $q_{0}^{\mathcal{R}}$ from the query is the transparent AV in the intersection's center. Surrounding cars represent classes with trajectories that lead to a collision with the AV at time $\tau$ with configuration $q_{0}^{\mathcal{R}}$. Green lines show feasible trajectories based on prior knowledge, and a red line shows an infeasible trajectory that violates the prior. The perception requirements in this example are the depicted car configurations $q_{\text{suv}}^{\mathcal{R}}$ and $q_{\text{bus}}^{\mathcal{R}}$ with green trajectories.
  • Figure 4: Comparison of the perception performance of two pipelines: Velodyne HDL32E lidar with PointPillars detection model [Lang2020PointPillars:Clouds] (top plots) and Basler acA1600-60gc camera with FCOS3D detection model [wang2021fcos3d] (bottom plots). Left plots show FNRs and right plots FPRs, highlighting the upper bounds of confidence intervals against radial distance $r$ and relative orientation $\theta$ between sensor and object class in polar coordinates. Data is from the nuScenes dataset [caesar2020nuscenes], using models from the MMDetection3D library [mmdet3d2020].
  • Figure 5: Graphical representation of the simplified data flow for the entire benchmarking process, from sensor measurements to the calculation of FNR and FPR (sensor data taken from [caesar2020nuscenes], model architecture image from [Lang2020PointPillars:Clouds]).
  • ...and 28 more figures

Theorems & Definitions (40)

  • Definition 1: Body
  • Remark 2
  • Definition 3: Robot
  • Definition 4: Object class
  • Remark 5
  • Remark 6
  • Definition 7: Scenario
  • Remark 8
  • Definition 9: Task
  • Definition 10: Query
  • ...and 30 more