DeepDFA: Automata Learning through Neural Probabilistic Relaxations
Elena Umili, Roberto Capobianco
TL;DR
DeepDFA is a differentiable, probabilistic relaxation of DFAs that is trained with gradient descent and then converted to a crisp DFA via temperature annealing and Hopcroft minimization. It combines the interpretability of classical automata with the scalability and noise tolerance of neural models, enabling efficient learning from traces even with imperfect grounding. On the Tomita benchmarks and on random DFAs, DeepDFA achieves high accuracy with state counts close to the target and outperforms SAT-based and RNN-based baselines, particularly under label and symbol noise. The approach offers a practical, scalable route to automatic grammar induction and DFA identification that remains robust when the training data are noisy.
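To make the idea of a probabilistic relaxation concrete, the following is a minimal sketch, not the authors' implementation. It assumes that transitions and acceptance are parameterized by logits passed through a temperature-scaled softmax, and that the crisp DFA is read off by taking the argmax of each distribution. The names `softmax`, `accept_probability`, `discretize`, and `run_dfa`, as well as the hand-picked logits, are hypothetical. Gradient-based training and Hopcroft minimization are omitted.

```python
import math

def softmax(logits, temperature):
    """Temperature-scaled softmax; a lower temperature gives a sharper distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def accept_probability(trace, trans_logits, accept_logits, temperature):
    """Probability that the relaxed automaton accepts `trace`.

    trans_logits[symbol][state] holds logits over next states (assumed layout).
    accept_logits[state] holds a pair of logits (reject, accept).
    """
    n_states = len(accept_logits)
    belief = [1.0] + [0.0] * (n_states - 1)  # start in state 0 with certainty
    for symbol in trace:
        new_belief = [0.0] * n_states
        for q, mass in enumerate(belief):
            row = softmax(trans_logits[symbol][q], temperature)
            for q_next, p in enumerate(row):
                new_belief[q_next] += mass * p
        belief = new_belief
    return sum(mass * softmax(accept_logits[q], temperature)[1]
               for q, mass in enumerate(belief))

def discretize(trans_logits, accept_logits):
    """Read off a crisp DFA by taking the argmax of every distribution."""
    delta = {(q, a): max(range(len(row)), key=row.__getitem__)
             for a, rows in enumerate(trans_logits)
             for q, row in enumerate(rows)}
    accepting = {q for q, (rej, acc) in enumerate(accept_logits) if acc > rej}
    return delta, accepting

def run_dfa(trace, delta, accepting):
    """Run the crisp DFA from state 0 and report whether it accepts."""
    q = 0
    for a in trace:
        q = delta[(q, a)]
    return q in accepting

# Hand-picked logits (illustrative, not learned) for the language 1* over {0, 1}:
# state 0 is accepting, state 1 is a rejecting sink.
TRANS = [
    [[0.0, 2.0], [0.0, 2.0]],  # symbol 0: every state moves to state 1
    [[2.0, 0.0], [0.0, 2.0]],  # symbol 1: every state stays where it is
]
ACC = [[0.0, 2.0], [2.0, 0.0]]

warm = accept_probability([1, 1], TRANS, ACC, temperature=1.0)   # soft: strictly between 0 and 1
cold = accept_probability([1, 1], TRANS, ACC, temperature=0.01)  # nearly crisp: close to 1
delta, accepting = discretize(TRANS, ACC)
```

At a high temperature the model behaves like a soft, differentiable automaton. As the temperature approaches zero, each softmax approaches a one-hot vector and the acceptance probability agrees with the crisp DFA returned by `discretize`.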
Abstract
In this work, we introduce DeepDFA, a novel approach to identifying Deterministic Finite Automata (DFAs) from traces, harnessing a differentiable yet discrete model. Inspired by both the probabilistic relaxation of DFAs and Recurrent Neural Networks (RNNs), our model offers interpretability post-training, alongside reduced complexity and enhanced training efficiency compared to traditional RNNs. Moreover, by leveraging gradient-based optimization, our method surpasses combinatorial approaches in both scalability and noise resilience. Validation experiments conducted on target regular languages of varying size and complexity demonstrate that our approach is accurate, fast, and robust to noise in both the input symbols and the output labels of training data, integrating the strengths of both logical grammar induction and deep learning.
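As an illustration of the two kinds of noise mentioned above, the following sketch corrupts a labeled set of traces. It is an assumed noise model, not necessarily the protocol used in the experiments: each input symbol is independently replaced by a different symbol with probability `symbol_noise`, and each output label is independently flipped with probability `label_noise`. The function name `corrupt` and its parameters are hypothetical.

```python
import random

def corrupt(dataset, alphabet, symbol_noise, label_noise, seed=0):
    """Return a noisy copy of `dataset`, a list of (trace, label) pairs.

    Assumed noise model: independent per-symbol substitution and
    independent per-trace label flipping.
    """
    rng = random.Random(seed)  # local generator so results are reproducible
    noisy = []
    for trace, label in dataset:
        new_trace = []
        for s in trace:
            if rng.random() < symbol_noise:
                # Replace the symbol with a different one from the alphabet.
                new_trace.append(rng.choice([a for a in alphabet if a != s]))
            else:
                new_trace.append(s)
        new_label = (not label) if rng.random() < label_noise else label
        noisy.append((new_trace, new_label))
    return noisy

# Traces labeled by the language 1* over {0, 1} (illustrative data).
clean = [([1, 1, 1], True), ([1, 0, 1], False), ([], True)]
unchanged = corrupt(clean, alphabet=[0, 1], symbol_noise=0.0, label_noise=0.0)
fully_noisy = corrupt(clean, alphabet=[0, 1], symbol_noise=1.0, label_noise=1.0)
```

With both rates set to zero the data are returned unchanged. With both set to one, every symbol is substituted and every label is flipped, which marks the two extremes between which a robustness study would typically vary the rates.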
