Data-Efficient Learning with Neural Programs

Alaia Solko-Breslin; Seewon Choi; Ziyang Li; Neelay Velingker; Rajeev Alur; Mayur Naik; Eric Wong

TL;DR

This work presents ISED, an algorithm for learning neural programs that relies only on input-output samples of black-box components. It introduces new benchmarks that involve calls to modern LLMs such as GPT-4, and it is also evaluated on benchmarks from the neurosymbolic learning literature.

Abstract

Many computational tasks can be naturally expressed as a composition of a DNN followed by a program written in a traditional programming language or an API call to an LLM. We call such composites "neural programs" and focus on the problem of learning the DNN parameters when the training data consist of end-to-end input-output labels for the composite. When the program is written in a differentiable logic programming language, techniques from neurosymbolic learning are applicable, but in general, learning neural programs requires estimating the gradients of black-box components. We present an algorithm for learning neural programs, called ISED, that relies only on input-output samples of black-box components. For evaluation, we introduce new benchmarks that involve calls to modern LLMs such as GPT-4 and also consider benchmarks from the neurosymbolic learning literature. Our evaluation shows that for the latter benchmarks, ISED has comparable performance to state-of-the-art neurosymbolic frameworks. For the former, we use adaptations of prior work on gradient approximations of black-box components as a baseline, and show that ISED achieves comparable accuracy but in a more data- and sample-efficient manner.
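
To make the idea of using only input-output samples of a black-box component concrete, the following is a minimal sketch, not the ISED algorithm itself. It assumes a DNN has already produced probability vectors over discrete symbols for two inputs, samples symbol pairs from those vectors, runs a black-box program on each pair, and forms a plain Monte Carlo estimate of the distribution over program outputs. The names `black_box_sum` and `estimate_output_distribution` are hypothetical and chosen for illustration; the estimator the paper actually uses may differ.

```python
import random
from collections import Counter


def black_box_sum(a: int, b: int) -> int:
    """Hypothetical stand-in for an arbitrary program or API call.

    Only its input-output behavior is used; no gradients are required.
    """
    return a + b


def estimate_output_distribution(p_a, p_b, program, num_samples, rng):
    """Estimate a distribution over program outputs from sampled symbols.

    p_a and p_b are probability vectors over the symbols 0..len-1
    (assumed here to be the DNN's predictions for two inputs).
    """
    symbols_a = list(range(len(p_a)))
    symbols_b = list(range(len(p_b)))
    counts = Counter()
    for _ in range(num_samples):
        # Sample one symbol per input according to the predicted distribution.
        a = rng.choices(symbols_a, weights=p_a)[0]
        b = rng.choices(symbols_b, weights=p_b)[0]
        # Query the black box and record the observed output.
        counts[program(a, b)] += 1
    # Normalize counts into relative frequencies.
    return {y: c / num_samples for y, c in counts.items()}
```

Calling `estimate_output_distribution(p_a, p_b, black_box_sum, 100, random.Random(0))` returns a dictionary mapping each observed output to its relative frequency. In a training loop, an estimate of this kind could be compared against the end-to-end label to produce a learning signal for the DNN; how that signal is constructed is specific to the method and is not shown here.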

Paper Structure (27 sections, 9 equations, 4 figures, 13 tables, 1 algorithm)

Introduction
Neural Programs
Learning Neural Programs
ISED Overview
Preliminaries and Programming Interface
Algorithm
Evaluation
Benchmark Tasks: NeuroGPT, NeuroPython, and Neurosymbolic
Evaluation Setup and Baselines
RQ1: Performance and Accuracy
RQ2: Sample Efficiency
RQ3: Data Efficiency
Limitations and Future Work
Related Work
Conclusion
...and 12 more sections

Figures (4)

Figure 1: Neural program decomposition for scene recognition.
Figure 2: Illustration of our inference pipeline for the leaf classification task. leaf_id can be written with a decision tree (top program) or with a call to GPT-4 (bottom program).
Figure 3: Accuracy vs. Time for sum$_3$.
Figure 4: Accuracy vs. Time for sum$_4$.
