A Neurosymbolic Approach to Natural Language Formalization and Verification
Sam Bayless, Stefano Buliani, Darion Cassel, Byron Cook, Duncan Clough, Rémi Delmas, Nafi Diallo, Ferhat Erata, Nick Feng, Dimitra Giannakopoulou, Aman Goel, Aditya Gokhale, Joe Hendrix, Marc Hudak, Dejan Jovanović, Andrew M. Kent, Benjamin Kiesl-Reiter, Jeffrey J. Kuna, Nadia Labai, Joseph Lilien, Divya Raghunathan, Zvonimir Rakamarić, Niloofar Razavi, Michael Tautschnig, Ali Torkamani, Nathaniel Weir, Michael W. Whalen, Jianan Yao
TL;DR
This work tackles the challenge of deploying LLMs in regulated domains by mitigating hallucinations with auditable guardrails. It introduces ARc, a neurosymbolic framework with two components: the Policy Model Creator (PMC), which formalizes natural language (NL) policies offline, and the Answer Verifier (AV), which validates NL content against those policies at inference time. ARc reports soundness above 99% on unseen data and provides detailed, auditable reasoning and corrective feedback. Human policy vetting further improves performance and demonstrates practical applicability to real-world policies. The work also highlights tradeoffs between safety and coverage and outlines avenues for automation and scalability in future work.
Abstract
Large language models (LLMs) perform well at natural language interpretation and reasoning, but their inherent stochasticity limits their adoption in regulated industries, such as finance and healthcare, that operate under strict policies. To address this limitation, we present a two-stage neurosymbolic framework that (1) uses LLMs with optional human guidance to formalize natural language policies, allowing fine-grained control of the formalization process, and (2) uses inference-time autoformalization to validate the logical correctness of natural language statements against those policies. When correctness is paramount, we perform multiple redundant formalization steps at inference time, cross-checking the formalizations for semantic equivalence. Our benchmarks demonstrate that our approach exceeds 99% soundness, indicating a near-zero false positive rate in identifying logical validity. Our approach produces auditable logical artifacts that substantiate the verification outcomes and can be used to improve the original text.
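The redundant-formalization idea can be illustrated with a toy sketch. This is not the paper's implementation: it assumes propositional formulas written as Python callables, replaces a real solver with brute-force truth-table enumeration, and hard-codes the candidate formalizations that an LLM would otherwise produce. The names `equivalent`, `entails`, `verify`, and the result labels are hypothetical.

```python
from itertools import product
from typing import Callable, Dict, Iterator, Sequence

Assignment = Dict[str, bool]
Formula = Callable[[Assignment], bool]  # toy stand-in for a formal policy or claim


def assignments(variables: Sequence[str]) -> Iterator[Assignment]:
    """Enumerate every truth assignment over the given variables."""
    for values in product([False, True], repeat=len(variables)):
        yield dict(zip(variables, values))


def equivalent(f: Formula, g: Formula, variables: Sequence[str]) -> bool:
    """Two formulas are semantically equivalent if they agree everywhere."""
    return all(f(env) == g(env) for env in assignments(variables))


def entails(policy: Formula, claim: Formula, variables: Sequence[str]) -> bool:
    """The policy entails the claim if the claim holds wherever the policy holds."""
    return all(claim(env) for env in assignments(variables) if policy(env))


def verify(policy: Formula, candidates: Sequence[Formula],
           variables: Sequence[str]) -> str:
    """Cross-check redundant formalizations, then check them against the policy."""
    if not candidates:
        raise ValueError("at least one candidate formalization is required")
    first = candidates[0]
    # Hypothetical rule: refuse to give a verdict unless all candidates agree.
    if not all(equivalent(first, c, variables) for c in candidates[1:]):
        return "NO_CONSENSUS"
    if entails(policy, first, variables):
        return "VALID"
    if entails(policy, lambda env: not first(env), variables):
        return "INVALID"
    return "UNDETERMINED"


# Illustrative policy: "tenured employees are eligible for leave".
VARS = ["tenured", "eligible"]
policy = lambda e: (not e["tenured"]) or e["eligible"]

# Two redundant formalizations of the same statement, written differently.
agreeing = [
    lambda e: (not e["tenured"]) or e["eligible"],
    lambda e: not (e["tenured"] and not e["eligible"]),
]
# A second formalization that drifts in meaning from the first.
disagreeing = [agreeing[0], lambda e: e["eligible"]]

print(verify(policy, agreeing, VARS))
print(verify(policy, disagreeing, VARS))
```

Declining to answer when the candidates disagree trades coverage for safety: a disagreement produces no verdict rather than a possibly wrong one, which is one way a low false positive rate could be maintained in such a scheme.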
