Learning Interpretable Differentiable Logic Networks
Chang Yue, Niraj K. Jha
TL;DR
This work introduces Differentiable Logic Networks (DLNs), a class of architectures that learn interpretable Boolean rules through gradient-based optimization. Training is decomposed into two phases: searching neuron functions, then learning connections. The networks are built from three layer types (ThresholdLayer, LogicLayer, and SumLayer) that are trained with differentiable relaxations and straight-through estimators. DLNs achieve accuracy competitive with traditional NNs while using up to a thousand times fewer gate-level operations at inference, which makes edge deployment practical. Post-training simplification yields compact, human-readable logic expressions, improving interpretability without sacrificing performance on many tabular tasks. Training cost and occasional dataset-specific limitations remain, and they motivate future work on ensembles and prior-rule integration to broaden applicability.
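To make the idea of a differentiable relaxation with a straight-through estimator concrete, here is a minimal sketch of how a continuous feature could be binarized against a learnable threshold. It is an illustration under stated assumptions, not the authors' ThresholdLayer: the sigmoid relaxation, the `temperature` parameter, and both function names are hypothetical choices made for this example.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def threshold_forward(x, threshold, temperature=1.0, hard=True):
    """Binarize a continuous feature x against a learnable threshold.

    hard=True returns the discrete 0/1 value used in the forward pass;
    hard=False returns the sigmoid relaxation (assumed here for illustration).
    """
    soft = sigmoid((x - threshold) / temperature)
    if hard:
        return 1.0 if soft > 0.5 else 0.0
    return soft


def threshold_grad_wrt_threshold(x, threshold, temperature=1.0):
    """Straight-through surrogate gradient.

    The hard step has zero gradient almost everywhere, so the backward pass
    substitutes the gradient of the soft relaxation with respect to the
    threshold: d/dt sigmoid((x - t) / T) = -s * (1 - s) / T.
    """
    soft = sigmoid((x - threshold) / temperature)
    return -soft * (1.0 - soft) / temperature
```

The forward pass emits an exact bit, so downstream logic sees Boolean inputs, while the surrogate gradient still tells the optimizer in which direction to move the threshold.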
Abstract
The ubiquity of neural networks (NNs) in real-world applications, from healthcare to natural language processing, underscores their immense utility in capturing complex relationships within high-dimensional data. However, NNs come with notable disadvantages, such as their "black-box" nature, which hampers interpretability, and their tendency to overfit the training data. We introduce a novel method for learning interpretable differentiable logic networks (DLNs), architectures that employ multiple layers of binary logic operators. We train these networks by softening and differentiating their discrete components, such as the binarization of inputs, the binary logic operations, and the connections between neurons. This approach enables the use of gradient-based learning methods. Experimental results on twenty classification tasks indicate that DLNs can achieve accuracies comparable to or exceeding those of traditional NNs. Equally important, these networks offer the advantage of interpretability. Moreover, their relatively simple structure means that inference requires up to a thousand times fewer logic gate-level operations than NNs, making them suitable for deployment on edge devices.
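One way to picture "softening" a binary logic operation is to replace each gate with a real-valued formula that agrees with it on 0/1 inputs, and to let a neuron hold a softmax-weighted mixture over candidate gates. The sketch below assumes this common relaxation; the abstract does not specify the exact formulation, and the gate set, the `logits` parameter, and the function names are hypothetical.

```python
import math

# Real-valued relaxations of two-input gates; each is exact on {0, 1} inputs.
GATES = {
    "AND": lambda a, b: a * b,
    "OR": lambda a, b: a + b - a * b,
    "XOR": lambda a, b: a + b - 2.0 * a * b,
    "NAND": lambda a, b: 1.0 - a * b,
}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def soft_neuron(a, b, logits):
    """Training-time output: softmax-weighted mixture of all candidate gates."""
    weights = softmax(logits)
    return sum(w * gate(a, b) for w, gate in zip(weights, GATES.values()))


def hard_neuron(a, b, logits):
    """Inference-time output: keep only the highest-scoring gate."""
    names = list(GATES)
    best = names[max(range(len(logits)), key=lambda i: logits[i])]
    return best, GATES[best](a, b)
```

During training the mixture is differentiable in `logits`, so gradient descent can shift weight toward a useful gate. After training, each neuron collapses to a single discrete gate, which is what makes the resulting network readable as a logic expression and cheap to evaluate.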
