Learning to Infer Graphics Programs from Hand-Drawn Images
Kevin Ellis, Daniel Ritchie, Armando Solar-Lezama, Joshua B. Tenenbaum
TL;DR
This work infers high-level graphics programs from hand-drawn diagrams by factoring the problem into two steps: image to specification, then specification to program. It combines a neural network that proposes drawing commands with Sequential Monte Carlo and a constraint-based DSL synthesizer, and it accelerates synthesis with a learned bias-optimal search policy. The approach generalizes to noisy real drawings, corrects errors in the neural proposals, supports program-based similarity measurements, and extrapolates repetitive patterns. Together, these results show a viable path toward automatically inducing human-readable programs from perceptual input, with practical implications for figure generation and interpretation.
Abstract
We introduce a model that learns to convert simple hand drawings into graphics programs written in a subset of LaTeX. The model combines techniques from deep learning and program synthesis. We learn a convolutional neural network that proposes plausible drawing primitives that explain an image. These drawing primitives are like a trace of the set of primitive commands issued by a graphics program. We learn a model that uses program synthesis techniques to recover a graphics program from that trace. These programs have constructs like variable bindings, iterative loops, or simple kinds of conditionals. With a graphics program in hand, we can correct errors made by the deep network, measure similarity between drawings by their use of similar high-level geometric structures, and extrapolate drawings. Taken together these results are a step towards agents that induce useful, human-readable programs from perceptual input.
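To make the trace-to-program step concrete, here is a minimal Python sketch of the idea, not the paper's synthesizer. It assumes a hypothetical trace format in which each primitive is a circle `(x, y, r)` on an integer grid. It handles only the simplest case: deciding whether the whole trace is reproduced by a single loop with a constant step. The names `Circle`, `Loop`, and `synthesize_loop` are invented for illustration. The actual system searches a richer DSL with a constraint-based synthesizer.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Hypothetical primitive format: a circle with center (x, y) and radius r.
Circle = Tuple[int, int, int]


@dataclass(frozen=True)
class Loop:
    """Stands for: for i in range(count): circle(x0 + i*dx, y0 + i*dy, r)."""
    count: int
    x0: int
    y0: int
    dx: int
    dy: int
    r: int

    def run(self) -> List[Circle]:
        # Execute the program, producing the trace of primitives it draws.
        return [(self.x0 + i * self.dx, self.y0 + i * self.dy, self.r)
                for i in range(self.count)]

    def extrapolate(self, extra: int) -> "Loop":
        # Extrapolating a drawing amounts to running the loop for more iterations.
        return Loop(self.count + extra, self.x0, self.y0,
                    self.dx, self.dy, self.r)


def synthesize_loop(trace: List[Circle]) -> Optional[Loop]:
    """Return a Loop that reproduces the trace exactly, or None if none fits."""
    if len(trace) < 3:
        return None  # too few primitives to justify a loop
    pts = sorted(trace)  # the trace is an unordered set, so fix an order
    (x0, y0, r), (x1, y1, _) = pts[0], pts[1]
    # Guess the step from the first two primitives, then verify by execution.
    candidate = Loop(len(pts), x0, y0, x1 - x0, y1 - y0, r)
    return candidate if candidate.run() == pts else None


if __name__ == "__main__":
    program = synthesize_loop([(5, 1, 1), (1, 1, 1), (3, 1, 1)])
    if program is not None:
        print(program.extrapolate(2).run())
```

Under these assumptions, the recovered loop is a more compact description than the list of circles, and extending its iteration count continues the pattern beyond what was drawn. A trace that no single loop explains returns `None`, which is where a real synthesizer would fall back to a larger program space.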
