Exploring the hierarchical structure of human plans via program generation

Carlos G. Correa, Sophia Sanborn, Mark K. Ho, Frederick Callaway, Nathaniel D. Daw, Thomas L. Griffiths

TL;DR

This work addresses how humans form hierarchically structured plans by analyzing the programs people write to solve Lightbot tasks. It introduces a grammar-induction framework, based on adaptor grammars and Dirichlet Processes, to capture a bias toward subroutine reuse that is not predicted by MDL or by simple utility maximization (using fewer actions). Empirical results show that the grammar induction model, especially when combined with a step-cost prior, best predicts participants' programs and explains qualitative reuse patterns beyond compressibility. The findings suggest that hierarchical planning is guided by rich-get-richer reuse dynamics and that explicit hierarchical representations can simplify planning and execution, with implications for models of human planning and program generation.
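The "rich-get-richer" dynamic mentioned above can be illustrated with the Chinese restaurant process, the standard predictive view of a Dirichlet Process. The sketch below is not the authors' adaptor-grammar model; the function name `crp_predictive`, the sentinel `NEW`, and the example counts are assumptions made for illustration only.

```python
# Minimal sketch (not the paper's implementation): Chinese restaurant process
# predictive probabilities, showing why frequently used items get reused more.

NEW = "<new>"  # hypothetical sentinel for "create a new subroutine"


def crp_predictive(counts, alpha):
    """Return the probability of reusing each existing item or creating a new one.

    counts: dict mapping an item (e.g. a subroutine) to how often it was used.
    alpha:  concentration parameter; larger values favor creating new items.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    total = sum(counts.values())
    denom = total + alpha
    # Each existing item is chosen in proportion to its past usage count.
    probs = {item: n / denom for item, n in counts.items()}
    # A new item is chosen in proportion to alpha.
    probs[NEW] = alpha / denom
    return probs


if __name__ == "__main__":
    # Hypothetical usage counts for two subroutines.
    probs = crp_predictive({"A": 3, "B": 1}, alpha=1.0)
    print(probs)  # {'A': 0.6, 'B': 0.2, '<new>': 0.2}
```

Under these assumed counts, the subroutine already used three times is three times as likely to be chosen again as the one used once, which is the sense in which reuse compounds.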

Abstract

Human behavior is often assumed to be hierarchically structured, made up of abstract actions that can be decomposed into concrete actions. However, behavior is typically measured as a sequence of actions, which makes it difficult to infer its hierarchical structure. In this paper, we explore how people form hierarchically structured plans, using an experimental paradigm with observable hierarchical representations: participants create programs that produce sequences of actions in a language with explicit hierarchical structure. This task lets us test two well-established principles of human behavior: utility maximization (i.e. using fewer actions) and minimum description length (MDL; i.e. having a shorter program). We find that humans are sensitive to both metrics, but that both accounts fail to predict a qualitative feature of human-created programs, namely that people prefer programs with reuse over and above the predictions of MDL. We formalize this preference for reuse by extending the MDL account into a generative model over programs, modeling hierarchy choice as the induction of a grammar over actions. Our account can explain the preference for reuse and provides better predictions of human behavior, going beyond simple accounts of compressibility to highlight a principle that guides hierarchical planning.
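The two metrics named in the abstract can be made concrete with a small sketch. It assumes a program is a mapping from subroutine names to instruction lists, that step count is the number of primitive actions in the execution trace (subroutine calls themselves are assumed not to count as steps), and that program length is the total number of instructions across all subroutines. The example program, the names `Main` and `P1`, and the helper functions are hypothetical; the action letters follow the legend given in the Figure 2 caption (W: Walk, S: Activate Light, L: Turn Left).

```python
# Minimal sketch under the stated assumptions; not the authors' code.

def expand(program, name="Main", depth=0, max_depth=50):
    """Return the flat list of primitive actions produced by running `name`."""
    if depth > max_depth:
        # Guard against programs whose subroutines call each other forever.
        raise RecursionError("subroutine nesting exceeded max_depth")
    trace = []
    for instr in program[name]:
        if instr in program:
            # The instruction is a subroutine call: splice in its actions.
            trace.extend(expand(program, instr, depth + 1, max_depth))
        else:
            # The instruction is a primitive action.
            trace.append(instr)
    return trace


def step_count(program):
    """Number of primitive actions executed (the utility-based metric)."""
    return len(expand(program))


def program_length(program):
    """Total instructions written across all subroutines (the MDL-style metric)."""
    return sum(len(body) for body in program.values())


if __name__ == "__main__":
    # Hypothetical hierarchical program: Main calls P1 twice.
    hierarchical = {"Main": ["P1", "L", "P1"], "P1": ["W", "W", "W", "S"]}
    # The same behavior written without subroutines.
    flat = {"Main": ["W", "W", "W", "S", "L", "W", "W", "W", "S"]}

    print(step_count(hierarchical), program_length(hierarchical))  # 9 7
    print(step_count(flat), program_length(flat))                  # 9 9
```

In this assumed example both programs produce the same nine actions, so step count cannot distinguish them, while program length favors the version with a reused subroutine.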

Paper Structure (40 sections, 15 equations, 22 figures, 5 tables, 2 algorithms)

Figures (22)

  • Figure 1: The interface for Lightbot, a process-tracing experiment for complex, hierarchical plans. Participants create programs by dragging instructions from the lower right to define a program. The program is executed by the robot from the initial subroutine ("Main"), and the task is completed when all blue squares in the environment are activated. An example program that solves the task is shown. The program includes a single subroutine ("Process 1") with four actions (Walk, Walk, Walk, Activate Light). Participants can use up to four subroutines (referred to as processes in the experiment), but for brevity only two are pictured. The Run button executes the program with an animation of the robot taking each action. The program is executed without animation with the Quick Run button. The program length is displayed as an Instruction Count, and the white buttons at the top are used to clear all instructions from a subroutine for ease of program editing.
  • Figure 1: Example programs demonstrating the influence of step costs on participant programs. The most common hierarchical program (first column) uses a subroutine (Turn Left, Jump, Activate Light) three times. A program discovered by program search (second column) has the same subroutine and uses it four times. This pattern of choice is inconsistent with MDL and grammar induction, but is explained by step count. Third and fourth columns are a similar example, with the subroutine Walk, Turn Left, Jump, Activate Light. Fifth column shows that participants will use a shorter subroutine (Jump, Activate Light) four times. See Fig. \ref{fig:sample-programs0} for more detail about this figure.
  • Figure 2: Two example programs that solve the task are shown in Fig. \ref{fig:intro-program-writing-ui}. The top row contains the same program as Fig. \ref{fig:intro-program-writing-ui}, demonstrating the execution trace, or sequence of resulting actions, in blue. Table rows correspond to distinct programs. Table columns show a program execution trace, a tree representation of the program, the step count (number of actions) resulting from program execution, and the program length. The tree representation of a program shows how the sequence of actions relates to the program structure. Actions are executed in sequence, from left to right. Subroutines are shown with green lines that connect the subroutine call to its constituent instructions. The first use of a subroutine has solid lines, while subsequent uses have dashed lines. In the experiment, participants only see an animation of the robot, not the execution trace in blue. Legend: S: Activate Light, W: Walk, J: Jump, R: Turn Right, L: Turn Left.
  • Figure 2: Model comparison of accounts, without program preprocessing. a) Plot of BIC of experimental data as predicted by each of the models, after parameter fitting. Models with smaller BIC are a better account of behavior. b) BIC of data under each model, but split by task. Parameters are the same as in a), so they are the best fit for all tasks. The color of the models is also the same as in a). Task letter is a reference to the subfigure in Fig. \ref{fig:exp-task}.
  • Figure 3: The tasks participants completed in the experiment.
  • ...and 17 more figures