What Did I Learn? Operational Competence Assessment for AI-Based Trajectory Planners

Michiel Braat, Maren Buermann, Marijke van Weperen, Jan-Pieter Paardekooper

TL;DR

The paper tackles the challenge of ensuring trustworthy AI for automated driving by estimating when an AI-based trajectory planner is operating in contexts it has not been adequately trained for. It introduces a knowledge-graph-based framework to describe driving data, computes a coverage metric (how well a scene's sub-scene configurations are represented in the training set) and a complexity metric (how difficult the scene is), and combines them into a competence score: $\mathrm{Competence}(s) = \mathrm{Coverage}(s) \cdot (1 - \mathrm{Complexity}(s))$. Using the NuPlan dataset, it constructs scene graphs, defines sub-scene patterns, and evaluates how coverage and complexity relate to planner performance, finding a meaningful but not strong alignment between competence and trajectory quality. The approach yields insight into dataset composition and provides a practical, explainable way to gauge operational risk in ML-driven automated driving (AD) systems, with potential to guide data curation and model deployment. Overall, the work contributes a principled, knowledge-graph (KG) based method to monitor competence, describe driving data, and anticipate when a trajectory planner’s output is trustworthy for a given context.
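The competence formula above can be illustrated with a minimal Python sketch. The function name `competence` and the input validation are assumptions made for illustration, as is the premise that both coverage and complexity are normalized to the range [0, 1]; this is not the authors' implementation.

```python
def competence(coverage: float, complexity: float) -> float:
    """Competence(s) = Coverage(s) * (1 - Complexity(s)).

    Assumption: both inputs are already normalized to [0, 1].
    """
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage is assumed to lie in [0, 1]")
    if not 0.0 <= complexity <= 1.0:
        raise ValueError("complexity is assumed to lie in [0, 1]")
    return coverage * (1.0 - complexity)


# A well-covered but difficult scene and a poorly covered but easy scene
# can end up with the same score, because the two factors multiply.
well_covered_hard = competence(coverage=1.0, complexity=0.75)
poorly_covered_easy = competence(coverage=0.25, complexity=0.0)
```

Under this form, a scene of maximal complexity scores zero regardless of coverage, and the score for a fixed coverage falls linearly as complexity rises, which matches the statement that more complex scenes need greater coverage to reach high competence.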

Abstract

Automated driving functions increasingly rely on machine learning for tasks like perception and trajectory planning, requiring large, relevant datasets. The performance of these algorithms depends on how closely the training data matches the task. To ensure reliable functioning, it is crucial to know what is included in the dataset to assess the trained model's operational risk. We aim to enhance the safe use of machine learning in automated driving by developing a method to recognize situations that an automated vehicle has not been sufficiently trained on. This method also improves explainability by describing the dataset at a human-understandable level. We propose modeling driving data as knowledge graphs, representing driving scenes with entities and their relationships. These graphs are queried for specific sub-scene configurations to check their occurrence in the dataset. We estimate a vehicle's competence in a driving scene by considering the coverage and complexity of sub-scene configurations in the training set. Higher complexity scenes require greater coverage for high competence. We apply this method to the NuPlan dataset, modeling it with knowledge graphs and analyzing the coverage of specific driving scenes. This approach helps monitor the competence of machine learning models trained on the dataset, which is essential for trustworthy AI to be deployed in automated driving.
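The idea of querying scene graphs for sub-scene configurations can be sketched as follows. Everything in this block is a hypothetical simplification: scenes are reduced to sets of (subject, relation, object) triples, a sub-scene configuration matches when all of its triples appear in a scene, and coverage is assumed to saturate at 1 once a configuration occurs at least `n` times. The entity and relation names are invented, and the paper's actual graph schema, query mechanism, and coverage definition may differ.

```python
from typing import FrozenSet, Iterable, Tuple

# Hypothetical simplification: a scene graph as a set of typed triples.
Triple = Tuple[str, str, str]
Scene = FrozenSet[Triple]


def occurrences(pattern: Scene, scenes: Iterable[Scene]) -> int:
    """Count scenes that contain every triple of the sub-scene pattern."""
    return sum(1 for scene in scenes if pattern <= scene)


def coverage(pattern: Scene, scenes: Iterable[Scene], n: int) -> float:
    """Assumed coverage: fraction of the required n occurrences, capped at 1."""
    if n <= 0:
        # Assumption: requiring no occurrences means full coverage.
        return 1.0
    return min(1.0, occurrences(pattern, scenes) / n)


# Invented toy training set of three scenes.
training_scenes = [
    frozenset({("ego", "on", "lane"), ("pedestrian", "near", "crosswalk")}),
    frozenset({("ego", "on", "lane"), ("vehicle", "ahead_of", "ego")}),
    frozenset({
        ("ego", "on", "lane"),
        ("vehicle", "ahead_of", "ego"),
        ("pedestrian", "near", "crosswalk"),
    }),
]

lead_vehicle = frozenset({("vehicle", "ahead_of", "ego")})
count = occurrences(lead_vehicle, training_scenes)
partial = coverage(lead_vehicle, training_scenes, n=4)
```

With this assumed definition, raising the required number of occurrences `n` can only lower coverage, which agrees with the trend described for Figure 3.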

Paper Structure

This paper contains 21 sections, 10 equations, 8 figures, 4 tables.

Figures (8)

  • Figure 1: Example of a scene in the KG representation from the NuPlan dataset.
  • Figure 2: Flowchart illustrating the process of data pre-processing and competence measurement for driving data.
  • Figure 3: Coverage per city sub-scene composition over different values of $n$. The x-axis represents different values of $n$ (ranging from 0 to 50,000), while the y-axis shows the coverage (ranging from 0.0 to 1.0). The graph illustrates how coverage decreases with increasing $n$ for sub-scenes in Boston and Singapore, including their mean coverage and overall occurrences.
  • Figure 4: Example of a context with no sub-scene matches.
  • Figure 5: Violin plots illustrating the complexity distributions across three metrics ($C_1$, $C_2$, and $C_3$) and the total complexity for Boston and Singapore.
  • ...and 3 more figures