RobustFill: Neural Program Learning under Noisy I/O

Jacob Devlin, Jonathan Uesato, Surya Bhupatiraju, Rishabh Singh, Abdel-rahman Mohamed, Pushmeet Kohli

TL;DR

RobustFill tackles program learning from input/output (I/O) examples with a modified attentional RNN that encodes variable-sized, unordered sets of I/O pairs, together with a DSL for string transformations. It directly compares neural program synthesis and neural program induction on a real-world, FlashFill-style task. The best synthesis model reaches 92% accuracy on a real-world test set, compared to 34% for the previous best neural synthesis approach, and the neural models can be trained to stay robust to noise such as typos. Synthesis does best when every example must be exactly correct, while induction can have the advantage under per-example evaluation; architecture choices such as late pooling and double attention also matter. Overall, the results indicate that neural methods can rival hand-engineered systems on practical, noisy I/O tasks, and they offer guidance on when to use synthesis versus induction in end-user applications.

Abstract

The problem of automatically generating a computer program from some specification has been studied since the early days of AI. Recently, two competing approaches for automatic program learning have received significant attention: (1) neural program synthesis, where a neural network is conditioned on input/output (I/O) examples and learns to generate a program, and (2) neural program induction, where a neural network generates new outputs directly using a latent program representation. Here, for the first time, we directly compare both approaches on a large-scale, real-world learning task. We additionally contrast to rule-based program synthesis, which uses hand-crafted semantics to guide the program generation. Our neural models use a modified attention RNN to allow encoding of variable-sized sets of I/O pairs. Our best synthesis model achieves 92% accuracy on a real-world test set, compared to the 34% accuracy of the previous best neural synthesis approach. The synthesis model also outperforms a comparable induction model on this task, but we more importantly demonstrate that the strength of each approach is highly dependent on the evaluation metric and end-user application. Finally, we show that we can train our neural models to remain very robust to the type of noise expected in real-world data (e.g., typos), while a highly-engineered rule-based system fails entirely.
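The abstract notes that the relative strength of synthesis and induction depends on the evaluation metric. As a minimal sketch of that distinction, the snippet below contrasts an all-examples metric (a task counts only if every held-out output is exactly right) with a per-example metric (each output is scored on its own). The function names and the toy predictions are assumptions made for illustration; they are not taken from the paper.

```python
from typing import Sequence


def all_examples_accuracy(
    predictions: Sequence[Sequence[str]], targets: Sequence[Sequence[str]]
) -> float:
    """Fraction of tasks whose held-out outputs are ALL exactly correct."""
    solved = sum(
        1 for pred, gold in zip(predictions, targets) if list(pred) == list(gold)
    )
    return solved / len(targets)


def per_example_accuracy(
    predictions: Sequence[Sequence[str]], targets: Sequence[Sequence[str]]
) -> float:
    """Fraction of individual held-out outputs that are exactly correct."""
    correct = 0
    total = 0
    for pred, gold in zip(predictions, targets):
        for p, g in zip(pred, gold):
            correct += int(p == g)
            total += 1
    return correct / total


# Hypothetical data: two tasks, each with two held-out outputs.
golds = [["Smith, John", "Jones, Alice"], ["2017", "2018"]]
preds = [["Smith, John", "Jones, Alice"], ["2017", "2081"]]  # one wrong output

strict = all_examples_accuracy(preds, golds)   # only the first task is fully solved
lenient = per_example_accuracy(preds, golds)   # three of four outputs are right
print(strict, lenient)
```

Under the strict metric a single wrong output fails the whole task, so the same predictions score lower than under the per-example metric. This is why a comparison between the two approaches can come out differently depending on which metric is used.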

Paper Structure

This paper contains 23 sections, 4 equations, 16 figures.

Figures (16)

  • Figure 1: An anonymized example from FlashFillTest with noise (typos). The goal of the task is to fill in the blank (i.e., '?' = 'Green, Steve'). Synthesis approaches achieve this by generating a program like the one shown. Induction approaches generate the new output string directly, conditioned on the other examples.
  • Figure 2: Syntax of the string transformation DSL.
  • Figure 3: The network architectures used for program synthesis. A dotted line from $x$ to $y$ means that $x$ attends to $y$.
  • Figure 4: Generalization results for program synthesis using several network architectures.
  • Figure 5: Results were obtained using Attention-C.
  • ...and 11 more figures
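
The Figure 1 caption describes synthesis as producing an explicit program. The toy sketch below illustrates one consequence: a candidate program can be checked against the observed I/O pairs before it is applied to a new input. Everything here is hypothetical: the example pairs, the candidate programs, and the brute-force filter stand in for the paper's data, DSL, and neural model, none of which are reproduced.

```python
from typing import Callable, Dict, List, Tuple

Example = Tuple[str, str]

# Hypothetical observed I/O pairs (invented for illustration, not the paper's data).
observed: List[Example] = [
    ("John Smith", "Smith, John"),
    ("Alice Jones", "Jones, Alice"),
]


def swap_names(s: str) -> str:
    """Toy program: turn 'First Last' into 'Last, First'."""
    first, last = s.split(" ", 1)
    return f"{last}, {first}"


def is_consistent(program: Callable[[str], str], examples: List[Example]) -> bool:
    """True if the program reproduces every observed output exactly."""
    return all(program(inp) == out for inp, out in examples)


# A tiny hand-written candidate set standing in for programs a synthesizer might propose.
candidates: Dict[str, Callable[[str], str]] = {
    "identity": lambda s: s,
    "upper": str.upper,
    "swap_names": swap_names,
}

# Keep only the candidates that agree with all observed examples.
consistent = [name for name, prog in candidates.items() if is_consistent(prog, observed)]

# Apply the surviving program to a new input to fill in the blank.
new_output = candidates[consistent[0]]("Steve Green")
print(consistent, new_output)
```

An induction model, by contrast, would emit the output string for the new input directly, so there is no intermediate program to check against the observed pairs.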