ODGR: Online Dynamic Goal Recognition

Matan Shamir, Osher Elhadad, Matthew E. Taylor, Reuth Mirsky

TL;DR

ODGR addresses goal recognition in dynamic, real-time settings by generalizing GR to inputs that arrive online and to goals that change over time (e.g., the goal sets $G^i$ and observations $O^i$). It proposes a transfer-learning-based framework (GATLing) that reuses base policies $\Pi_{G_b}$ to form dynamic policies $\Pi_{G_d}$ for new goals, enabling fast inference without relearning a policy for each goal. The paper formalizes ODGR, contrasts it with Online Static GR (OSGR), and demonstrates feasibility in a 2D navigation domain, reporting substantial runtime advantages and robust recognition under partial observations. This work paves the way for adaptable, scalable GR systems in dynamic environments.

Abstract

Traditionally, Reinforcement Learning (RL) problems are aimed at optimization of the behavior of an agent. This paper proposes a novel take on RL, which is used to learn the policy of another agent, to allow real-time recognition of that agent's goals. Goal Recognition (GR) has traditionally been framed as a planning problem where one must recognize an agent's objectives based on its observed actions. Recent approaches have shown how reinforcement learning can be used as part of the GR pipeline, but are limited to recognizing predefined goals and lack scalability in domains with a large goal space. This paper formulates a novel problem, "Online Dynamic Goal Recognition" (ODGR), as a first step to address these limitations. Contributions include introducing the concept of dynamic goals into the standard GR problem definition, revisiting common approaches by reformulating them using ODGR, and demonstrating the feasibility of solving ODGR in a navigation domain using transfer learning. These novel formulations open the door for future extensions of existing transfer learning-based GR methods, which will be robust to changing and expansive real-time environments.

Paper Structure

This paper contains 12 sections, 1 equation, 2 figures, 2 tables, 3 algorithms.

Figures (2)

  • Figure 1: OSGR and ODGR. The symbol $t_i$ denotes the initiation time of each process.
  • Figure 2: The heuristic Q-table generated from Q-tables whose base goals are at (1,6), (6,1), and (6,6) using cosine similarity and softmax (without scaling) in the $8 \times 8$ environment. Each cell shows the Q-values for taking the actions up, right, down, and left. The left (right) grid shows the resulting Q-table without (with) scaling.
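
The Figure 2 caption describes combining base-goal Q-tables through cosine similarity and softmax. The listing below is a minimal sketch of one way such a combination could work, not the paper's implementation. It assumes that the similarity is taken between the direction from a state to the new goal and the direction from that state to each base goal, and that the softmax of those similarities weights the base Q-values. The function names (`softmax`, `cosine_similarity`, `combine_q_tables`), the array layout, and the random stand-in Q-tables are all hypothetical, and the scaling step mentioned in the caption is omitted because it is not specified here.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    z = np.exp(x - np.max(x))
    return z / z.sum()


def cosine_similarity(u, v, eps=1e-8):
    """Cosine of the angle between u and v; 0.0 if either vector is (near) zero."""
    nu, nv = np.linalg.norm(u), np.linalg.norm(v)
    if nu < eps or nv < eps:
        return 0.0
    return float(np.dot(u, v) / (nu * nv))


def combine_q_tables(base_q, base_goals, new_goal, size=8):
    """Blend base Q-tables into a heuristic Q-table for new_goal.

    base_q     : list of arrays shaped (size, size, n_actions), one per base goal
    base_goals : list of (x, y) base-goal coordinates
    new_goal   : (x, y) coordinate of the new (dynamic) goal
    ASSUMPTION : weights come from the softmax of direction similarities.
    """
    n_actions = base_q[0].shape[-1]
    out = np.zeros((size, size, n_actions))
    new_goal = np.asarray(new_goal, dtype=float)
    goals = [np.asarray(g, dtype=float) for g in base_goals]
    for x in range(size):
        for y in range(size):
            state = np.array([x, y], dtype=float)
            # Similarity between "state -> new goal" and "state -> base goal".
            sims = np.array(
                [cosine_similarity(new_goal - state, g - state) for g in goals]
            )
            weights = softmax(sims)  # non-negative, sums to 1
            out[x, y] = sum(w * q[x, y] for w, q in zip(weights, base_q))
    return out


# Illustrative usage with random stand-in Q-tables (not results from the paper).
rng = np.random.default_rng(0)
base_goals = [(1, 6), (6, 1), (6, 6)]
base_q = [rng.random((8, 8, 4)) for _ in base_goals]
heuristic_q = combine_q_tables(base_q, base_goals, new_goal=(4, 4))
```

Because the softmax weights are non-negative and sum to one, each blended Q-value is a convex combination of the corresponding base Q-values, so it always stays within their range.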

Theorems & Definitions (2)

  • Definition 1
  • Definition 2