Using evolutionary computation to optimize task performance of unclocked, recurrent Boolean circuits in FPGAs

Raphael Norman-Tenazas; David Kleinberg; Erik C. Johnson; Daniel P. Lathrop; Matthew J. Roos

TL;DR

This work uses evolutionary computation to evolve the Boolean functions of nodes in unclocked, recurrent networks in FPGAs, obtaining an accuracy improvement of ~30% on an image classification task while processing at a rate of over three million samples per second.

Abstract

It has been shown that unclocked, recurrent networks of Boolean gates in FPGAs can be used for low-SWaP reservoir computing. In such systems, topology and node functionality of the network are randomly initialized. To create a network that solves a task, weights are applied to output nodes and learning is achieved by adjusting those weights with conventional machine learning methods. However, performance is often limited compared to networks where all parameters are learned. Herein, we explore an alternative learning approach for unclocked, recurrent networks in FPGAs. We use evolutionary computation to evolve the Boolean functions of network nodes. In one type of implementation the output nodes are used directly to perform a task and all learning is via evolution of the network's node functions. In a second type of implementation a back-end classifier is used as in traditional reservoir computing. In that case, both evolution of node functions and adjustment of output node weights contribute to learning. We demonstrate the practicality of node function evolution, obtaining an accuracy improvement of ~30% on an image classification task while processing at a rate of over three million samples per second. We additionally demonstrate evolvability of network memory and dynamic output signals.
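As a rough illustration of the node-function evolution described in the abstract, the Python sketch below treats each node as a 5-input LUT stored as a vector of 32 bits, combines the LUTs of two parents, and mutates the result. It is a software-only sketch, not the authors' implementation: the function names, population size, number of parents, mutation rate, input bit ordering, and the `fitness` callback are all assumptions, and in the actual system fitness would be measured on the FPGA.

```python
import random

N_INPUTS = 5                # 5-input, 1-output LUTs (per the Figure 1 caption)
LUT_BITS = 2 ** N_INPUTS    # 32 bits fully define one LUT


def random_network(n_luts, rng):
    """A network genome: one random 32-bit truth table per node."""
    return [[rng.randint(0, 1) for _ in range(LUT_BITS)] for _ in range(n_luts)]


def lut_output(lut, inputs):
    """Look up a node's output bit (assumption: first input is the MSB)."""
    index = 0
    for bit in inputs:
        index = (index << 1) | bit
    return lut[index]


def crossover(parent_a, parent_b, rng):
    """Child inherits each LUT (gene) whole from one of the two parents."""
    return [list(rng.choice((a, b))) for a, b in zip(parent_a, parent_b)]


def mutate(network, rate, rng):
    """Flip each truth-table bit independently with probability `rate`."""
    return [[bit ^ 1 if rng.random() < rate else bit for bit in lut]
            for lut in network]


def evolve(fitness, n_luts=100, pop_size=20, n_parents=4,
           generations=50, rate=0.01, seed=0):
    """Hypothetical evolution loop; `fitness` scores a genome (higher is better)."""
    rng = random.Random(seed)
    population = [random_network(n_luts, rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Only the best-performing networks become parents of the next generation.
        parents = sorted(population, key=fitness, reverse=True)[:n_parents]
        population = []
        for _ in range(pop_size):
            parent_a, parent_b = rng.sample(parents, 2)
            population.append(mutate(crossover(parent_a, parent_b, rng), rate, rng))
    return max(population, key=fitness)
```

Because this sketch keeps no parents between generations, the best score is not guaranteed to rise monotonically; whether the authors retain parents is not stated here.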

Paper Structure (7 sections, 4 figures)

Introduction
Methods
Experiments and Results
Image classification
Dynamic output
Temporal memory
Conclusion

Figures (4)

Figure 1: We implemented unclocked, recurrent networks of Boolean gates (e.g., "parent 1") on a PYNQ-Z1 development board hosting a Xilinx Zynq FPGA. Network node functions are reconfigurable LUTs defined by a vector of bits (3-input, 1-output LUTs in this depiction, but 5-input, 1-output LUTs in our PYNQ implementations). Such networks exhibit nanosecond-scale analog-like dynamics (lower left) that are not perfectly repeated for repeated digital inputs (overlaid traces). The discrete nature of digital logic prevents the use of gradient descent in on-hardware network training. Instead, we use evolutionary computation, in which genes (LUTs) of parent networks are combined and mutated to produce child networks. Only the best-performing child networks are then used as parents when creating the subsequent generation of networks.
Figure 2: Left: Handwritten digits of 784 (28x28) pixels are converted to 32-bit representations by discarding the least informative bits. This results in some information loss, but is done to accommodate use of the low-cost, low-resource PYNQ development board. Right: A network of 100 LUTs is used to process input digits (vectors) and make classification predictions. Before evolution, the traditional reservoir computing paradigm (“RPU-RC”), which uses a trained back-end classifier, performs better than a naïve network with no back-end classifier (“RPU”). After evolution, both RPU and RPU-RC perform much better than before evolution. After evolution, the use of the back-end classifier in the RPU-RC provides little benefit over the RPU. Nearly all processing is done internally by the dynamic interactions of the recurrent network, and the activation of one of ten output nodes directly indicates the classification prediction.
Figure 3: Performance of a network evolved to perform “digital-to-frequency” conversion. The 24-LUT network’s task is to take in a 4-bit integer (as four separate binary inputs, and representable as the decimal numbers 0 to 15) and generate a single output that fluctuates at a rate correlated with the value of the input decimal number. For this trained network, the correlation coefficient between the input values and the output fluctuation rates (“output frequency”) was 0.98. This experiment demonstrates the ability to evolve networks that have targeted dynamic outputs—a capability necessary for certain types of tasks such as operation of a control system.
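The score reported in the Figure 3 caption, a correlation between input values and output fluctuation rates, could be computed along the following lines. This is a minimal sketch under stated assumptions: the fluctuation rate is taken to be the number of bit transitions in a fixed-length recording of the output, and `recordings`, `transition_count`, and `frequency_fitness` are hypothetical names rather than anything defined by the authors.

```python
import math


def transition_count(samples):
    """Number of 0->1 or 1->0 changes in a sampled output bit stream."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def frequency_fitness(recordings):
    """Correlate input value with output transition count.

    `recordings` (assumed format) maps each 4-bit input value (0 to 15)
    to an equal-length list of sampled output bits.
    """
    values = sorted(recordings)
    rates = [transition_count(recordings[v]) for v in values]
    return pearson(values, rates)
```

A score near 1 means the output fluctuates faster for larger inputs, which is the behavior the evolved network is rewarded for.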
Figure 4: Behavior of the final generation of networks evolved to perform an N-back task, with N=3. Each row is the output bit of one of 100 networks while performing the task. Time (sample number) runs from left to right. Input bits were each presented for 0.16 $\mu$s (16 samples of the output node), and transitions are indicated by red vertical lines. The input bit presented between each pair of red lines is the number printed above that column. Output bits are colored yellow (1) and blue (0). As desired, outputs are equal to inputs (when averaged over the duration of an input bit) except with a delay of 3 input bits. Nearly all members of the population perform well, although one poorly performing child can be observed near row (network/organism) 55. Variation across rows is due to differences in LUT contents and to electronic noise owing to the analog nature of the circuit. This experiment demonstrates the ability to evolve networks that have intrinsic memory, a capability necessary for certain types of tasks such as speech recognition and RF signal classification.
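The N-back comparison described in the Figure 4 caption (output averaged over each input-bit window, then matched against the input from three bits earlier) can be sketched as follows. The 0.5 decision threshold, the function names, and the choice to ignore the first N windows are assumptions made for illustration; the caption does not say how the authors score the task.

```python
def decode_bits(samples, samples_per_bit=16):
    """Average each complete window of output samples and threshold at 0.5."""
    bits = []
    for start in range(0, len(samples) - samples_per_bit + 1, samples_per_bit):
        window = samples[start:start + samples_per_bit]
        bits.append(1 if sum(window) / samples_per_bit >= 0.5 else 0)
    return bits


def n_back_accuracy(input_bits, output_samples, n=3, samples_per_bit=16):
    """Fraction of windows whose decoded output equals the input n bits earlier.

    The first n windows have no defined target and are skipped (assumption).
    """
    decoded = decode_bits(output_samples, samples_per_bit)
    usable = min(len(input_bits), len(decoded))
    pairs = [(input_bits[i - n], decoded[i]) for i in range(n, usable)]
    if not pairs:
        return 0.0
    return sum(1 for expected, got in pairs if expected == got) / len(pairs)
```

Averaging over the window before thresholding makes the score tolerant of a few noisy samples per input bit, which matters because the output of an unclocked circuit is not perfectly repeatable.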
