SYNAPSE: SYmbolic Neural-Aided Preference Synthesis Engine

Sadanand Modak, Noah Patton, Isil Dillig, Joydeep Biswas

TL;DR

SYNAPSE addresses the challenge of learning individual human preferences from limited visual demonstrations and natural language (NL) by grounding concepts in a neuro-symbolic DSL and incrementally synthesizing executable programs. The method grounds concepts via NL explanations, generates a program sketch with numeric holes, and fills those holes from demonstrations using a constrained optimization (Max-SMT) framework, enabling lifelong, sample-efficient learning. Empirical results across mobility and manipulation domains show substantial out-of-distribution generalization and strong multi-user alignment, outperforming neural baselines and existing neurosymbolic methods while requiring far fewer demonstrations. Limitations include dependence on the quality of the neural modules and the NL input; suggested future work includes probabilistic reasoning and recency weighting to handle noise and changing preferences.
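To make the "sketch with numeric holes" idea concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. The `Demo` fields, the single-hole sketch `on_sidewalk AND dist_to_door > ??`, and the helper `fill_hole` are all hypothetical. Instead of a Max-SMT solver, it enumerates candidate thresholds and keeps the one that agrees with the most labeled demonstrations, which mirrors the maximize-satisfied-constraints objective on a toy scale.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Demo:
    """One labeled demonstration (all fields are hypothetical features)."""
    dist_to_door: float  # metres; assumed output of a perception module
    on_sidewalk: bool    # assumed output of a grounded predicate
    preferred: bool      # the user's label for this location


def sketch(demo: Demo, hole: float) -> bool:
    """Program sketch with one numeric hole: on_sidewalk AND dist_to_door > ??."""
    return demo.on_sidewalk and demo.dist_to_door > hole


def fill_hole(demos: List[Demo]) -> Tuple[float, int]:
    """Return (hole value, number of demonstrations it agrees with).

    Brute-force stand-in for a Max-SMT query: try one threshold below all
    observed distances, one between each pair of neighbours, and one above all.
    """
    dists = sorted({d.dist_to_door for d in demos})
    midpoints = [(a + b) / 2 for a, b in zip(dists, dists[1:])]
    candidates = [dists[0] - 1.0] + midpoints + [dists[-1] + 1.0]

    best_value, best_score = candidates[0], -1
    for value in candidates:
        # Count demonstrations whose label matches the sketch's output.
        score = sum(sketch(d, value) == d.preferred for d in demos)
        if score > best_score:
            best_value, best_score = value, score
    return best_value, best_score


demos = [
    Demo(0.5, True, False),
    Demo(1.0, True, False),
    Demo(3.0, True, True),
    Demo(4.0, True, True),
    Demo(5.0, False, False),
]
threshold, satisfied = fill_hole(demos)
print(threshold, satisfied)  # 2.0 5
```

With these five toy demonstrations, a threshold of 2.0 is consistent with every label. When the labels conflict, the same loop returns the threshold that violates the fewest demonstrations.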

Abstract

This paper addresses the problem of preference learning, which aims to align robot behaviors by learning user-specific preferences (e.g. "good pull-over location") from visual demonstrations. Despite its similarity to learning factual concepts (e.g. "red door"), preference learning is a fundamentally harder problem due to its subjective nature and the paucity of person-specific training data. We address this problem using a novel framework called SYNAPSE, which is a neuro-symbolic approach designed to efficiently learn preferential concepts from limited data. SYNAPSE represents preferences as neuro-symbolic programs in a domain-specific language (DSL) that operates over images, which facilitates inspection of individual parts for alignment. It leverages a novel combination of visual parsing, large language models, and program synthesis to learn programs representing individual preferences. We perform extensive evaluations on various preferential concepts, as well as user case studies demonstrating its ability to align well with dissimilar user preferences. Our method significantly outperforms baselines, especially in out-of-distribution generalization. We show the importance of the design choices in the framework through multiple ablation studies. Code, additional results, and supplementary material can be found on the website: https://amrl.cs.utexas.edu/synapse

Paper Structure (25 sections, 11 figures, 9 tables, 3 algorithms)

Figures (11)

  • Figure 1: Overview. Human preferences have both qualitative and quantitative aspects. SYNAPSE first learns the necessary predicates (a.k.a. auxiliary concepts) needed to represent the preference from the NL input. It then synthesizes a program sketch which likely has some quantitative holes. This sketch represents the preference qualitatively. Finally, the holes are filled up by an optimization process that uses the physical demonstration data, thereby capturing the quantitative part of the preference.
  • Figure 2: SYNAPSE neuro-symbolic DSL. Representing preference evaluator $\pi$ parametrized over concept library $\mathcal{C}$
  • Figure 3: Preference tasks. We show evaluation on three mobility tasks and one manipulation task. SYNAPSE utilizes pretrained module outputs and executes the learned program.
  • Figure 4: User study. Higher entries along the diagonal indicate good alignment between the learned program and the user's preference.
  • Figure 5: Sensitivity of SYNAPSE to the ordering of demonstrations. The gray area shows the variation in mean IoU (%) as SYNAPSE sees more demonstrations.
  • ...and 6 more figures
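
Figure 5 reports mean IoU (%). As a reminder of what that metric measures, here is a minimal sketch that assumes the learned program's output and the ground-truth preference are both binary masks over the same image grid. The mask representation and the name `iou_percent` are illustrative assumptions, not taken from the paper.

```python
from typing import List

Mask = List[List[int]]  # 1 = location marked as preferred, 0 = not preferred


def iou_percent(pred: Mask, truth: Mask) -> float:
    """Intersection over union of two equally sized binary masks, in percent."""
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += int(bool(p) and bool(t))
            union += int(bool(p) or bool(t))
    # Two empty masks agree perfectly, so treat that case as 100%.
    return 100.0 * intersection / union if union else 100.0


pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
print(iou_percent(pred, truth))  # 50.0
```

Here two cells are marked in both masks and four cells are marked in at least one, giving 2 / 4 = 50%.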