Using Cooperative Co-evolutionary Search to Generate Metamorphic Test Cases for Autonomous Driving Systems

Hossein Yousefizadeh, Shenghui Gu, Lionel C. Briand, Ali Nasr

TL;DR

The paper tackles system-level safety testing for autonomous driving systems (ADSs) by formalizing test-case generation as a search over scenario-perturbation pairs that violate metamorphic relations (MRs). It introduces CoCoMEGA, a cooperative co-evolutionary framework that evolves two populations (source scenarios and metamorphic perturbations), guided by a DTW-aligned fitness measure and diversity-preserving archiving. Empirical results on CARLA with InterFuser show that CoCoMEGA outperforms baselines in both the number and the diversity of MR-violating test cases, and that it does so more efficiently across search budgets and thresholds; expert assessments corroborate the safety relevance of the discovered violations. The work demonstrates a scalable, effective approach to uncovering subtle, unsafe ADS behaviors and suggests future work on extending it to additional simulators and on incorporating surrogate models to further boost efficiency.
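To make the DTW-based comparison mentioned above more concrete, here is a minimal sketch of classic dynamic time warping between two one-dimensional signals. It is an illustration under assumptions, not the paper's fitness function: the summary does not say which signals are compared or how the distance is turned into a fitness value, so the speed profiles and the names `dtw_distance`, `source_speed`, `followup_shifted`, and `followup_brake` are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


# Hypothetical ego-speed profiles (m/s) from a source run and two follow-up runs.
source_speed = [0.0, 2.0, 4.0, 6.0, 6.0]
followup_shifted = [0.0, 2.0, 2.0, 4.0, 6.0, 6.0]  # same behavior, slightly delayed
followup_brake = [0.0, 2.0, 4.0, 1.0, 0.0]         # unexpected hard braking

print(dtw_distance(source_speed, followup_shifted))
print(dtw_distance(source_speed, followup_brake))
```

The point of using an alignment-based distance in this setting is that a small timing shift between the source and follow-up runs costs nothing, whereas a genuinely different behavior (here, braking) yields a large distance that a search could try to maximize.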

Abstract

Autonomous Driving Systems (ADSs) rely on Deep Neural Networks, allowing vehicles to navigate complex, open environments. However, the unpredictability of these environments highlights the need for rigorous system-level testing to ensure safety, a task usually performed with a simulator in the loop. Although one important goal of such testing is to detect safety violations, testing should also target the many undesirable system behaviors that may not immediately lead to violations, thereby detecting subtler problems and enabling a finer-grained analysis. This paper introduces Cooperative Co-evolutionary MEtamorphic test Generator for Autonomous systems (CoCoMEGA), a novel automated testing framework aimed at advancing system-level safety assessments of ADSs. CoCoMEGA combines Metamorphic Testing (MT) with a search-based approach utilizing Cooperative Co-Evolutionary Algorithms (CCEAs) to efficiently generate a diverse set of test cases. CoCoMEGA emphasizes the identification of test scenarios that exhibit undesirable system behavior which may eventually lead to safety violations, as captured by Metamorphic Relations (MRs). When evaluated within the CARLA simulation environment on the InterFuser ADS, CoCoMEGA consistently outperforms baseline methods, demonstrating enhanced effectiveness and efficiency in generating severe, diverse MR violations and achieving broader exploration of the test space. These results underscore CoCoMEGA as a promising, more scalable solution to the inherent challenges of ADS testing with a simulator in the loop. Future research directions may include extending the approach to additional simulation platforms, applying it to other complex systems, and exploring methods for further improving testing efficiency, such as surrogate modeling.
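The cooperative co-evolutionary idea described in the abstract can be sketched as a toy loop with two populations that are scored jointly. This is only an illustration under stated assumptions: `joint_fitness` is a made-up analytic stand-in for a simulator run, individuals are single floats rather than scenarios or perturbations, collaboration uses the other population's current best individual (one common scheme, not necessarily the paper's), and the archive is omitted. All names are hypothetical.

```python
import random


def joint_fitness(scenario, perturbation):
    """Toy stand-in for a simulator run; higher means a more severe MR violation."""
    return -((scenario - 3.0) ** 2) - (perturbation + 1.0) ** 2


def evolve(pop, collaborator, fitness_of, rng, sigma=0.3):
    """One generation: rank with the collaborator, keep the top half, add mutated copies."""
    ranked = sorted(pop, key=lambda ind: fitness_of(ind, collaborator), reverse=True)
    parents = ranked[: len(pop) // 2]
    children = [p + rng.gauss(0.0, sigma) for p in parents]
    return parents + children  # parents[0] is the best individual of this generation


def cooperative_coevolution(generations=50, size=10, seed=0):
    rng = random.Random(seed)
    scenarios = [rng.uniform(-5.0, 5.0) for _ in range(size)]
    perturbations = [rng.uniform(-5.0, 5.0) for _ in range(size)]
    best_s, best_p = scenarios[0], perturbations[0]
    for _ in range(generations):
        # Scenarios are evaluated together with the best perturbation found so far.
        scenarios = evolve(scenarios, best_p, joint_fitness, rng)
        best_s = scenarios[0]
        # Perturbations are evaluated together with the best scenario found so far.
        perturbations = evolve(
            perturbations, best_s, lambda p, s: joint_fitness(s, p), rng
        )
        best_p = perturbations[0]
    return best_s, best_p, joint_fitness(best_s, best_p)


best_s, best_p, best_fit = cooperative_coevolution()
print(best_s, best_p, best_fit)
```

The design choice worth noting is that neither population has a fitness of its own: an individual is only ever scored as part of a complete scenario-perturbation pair, which is what makes the two searches cooperative rather than independent.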

Paper Structure

This paper contains 47 sections, 14 equations, 16 figures, 4 tables, 5 algorithms.

Figures (16)

  • Figure 1: The scenario domain model for the case study.
  • Figure 2: Overview of CoCoMEGA.
  • Figure 3: ds vs. distance threshold ($\theta_d$) across different fitness thresholds ($\theta_f$). A higher ds indicates that the method discovered more distinct violating solutions. Each curve plots the average ds at different $\theta_d$ settings under a specific fitness threshold $\theta_f$. This figure reveals how each method balances the quantity and diversity of solutions as $\theta_f$ varies.
  • Figure 4: apd vs. fitness threshold ($\theta_f$) across different distance thresholds ($\theta_d$). apd quantifies how spread out the solutions are; a higher apd indicates greater diversity among the found solutions. Each subplot corresponds to one distance threshold $\theta_d$, and the y-axis shows the apd of the discovered solutions under various $\theta_f$ values. This figure demonstrates how each method maintains diversity while striving for higher fitness (i.e., more severe violations).
  • Figure 5: Evolution of apd across generations. Higher apd values imply that solutions are more spread out in the search space, indicating broad exploration. Subplots differ by the chosen fitness threshold ($\theta_f$) and distance threshold ($\theta_d$). Each curve tracks the apd for a specific method over several generations of search. This figure illustrates how each method maintains or improves solution diversity over generations.
  • ...and 11 more figures