DECIDER: A Dual-System Rule-Controllable Decoding Framework for Language Generation

Chen Xu, Tian Lan, Yu Ji, Changlong Yu, Wei Wang, Jun Gao, Qunxi Dong, Kun Qian, Piji Li, Wei Bi, Bin Hu

TL;DR

DECIDER addresses the limitations of constrained generation by coupling a base language model (System 1) with a First-Order Logic reasoner (System 2) and a decision function, enabling rule-controllable text generation via predicates such as $R(x)$. At each decoding step, it computes a logic vector $\mathbf{I}^{\mathcal{V}}$ over the vocabulary and updates the next-word distribution using $\bar{P}^{\mathcal{V}} = f_{decis}(P^{\mathcal{V}}, \mathbf{I}^{\mathcal{V}}; \alpha)$, while also perturbing attention over both previous words and target words. The knowledge base (KB) defines rules and facts (including $Edge$ relations from ConceptNet), and the logic is vectorized for parallel evaluation: a rule parser builds a parse tree, and a prover performs backward chaining over it. Experiments on CommonGen and PersonaChat show that DECIDER outperforms prior constrained decoding methods in both quality and task satisfaction, highlighting the practical value of neuro-symbolic, rule-based control for scalable guidance of large language models without fine-tuning.
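
The summary above does not give the functional form of $f_{decis}$, so the following is only a minimal sketch under an assumed form: add $\alpha \cdot \mathbf{I}^{\mathcal{V}}$ to the log-probabilities and renormalize. The function name `decide`, the toy vocabulary, and the numbers are hypothetical and are not taken from the paper.

```python
import math

def decide(p, logic, alpha):
    """Assumed decision function (not the paper's exact formula):
    shift each log-probability by alpha * logic score, then renormalize.
    Requires every entry of p to be strictly positive."""
    scores = [math.log(pi) + alpha * li for pi, li in zip(p, logic)]
    top = max(scores)                       # subtract max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Toy next-word distribution P^V from the base model (hypothetical values).
vocab = ["classroom", "camping", "learning", "the"]
p = [0.1, 0.4, 0.1, 0.4]

# Toy logic vector I^V: 1.0 where the rule holds for the word, else 0.0.
logic = [1.0, 0.0, 1.0, 0.0]

p_bar = decide(p, logic, alpha=2.0)
for word, before, after in zip(vocab, p, p_bar):
    print(f"{word:10s} {before:.3f} -> {after:.3f}")
```

Under this assumed form, $\alpha = 0$ recovers the base model's distribution unchanged, and larger $\alpha$ moves more probability mass onto words that satisfy the rule.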

Abstract

Constrained decoding approaches aim to control the meaning or style of text generated by pre-trained large language models (LLMs, also called PLMs) for various tasks at inference time. However, these methods often guide plausible continuations by greedily and explicitly selecting targets. Though they fulfill the task requirements, these methods may overlook certain general and natural logics that humans would implicitly follow towards such targets. Inspired by cognitive dual-process theory, in this work we propose a novel decoding framework, DECIDER, in which the base LLM is equipped with a First-Order Logic (FOL) reasoner to express and evaluate the rules, along with a decision function that merges the outputs of both systems to guide the generation. Unlike previous constrained decoding methods, DECIDER shifts from encouraging specific target words to encouraging all words that satisfy several high-level rules, enabling us to programmatically integrate our logic into LLMs. Experiments on CommonGen and PersonaChat demonstrate that DECIDER effectively follows given FOL rules to guide LLMs in a more human-like and logic-controlled manner.

Paper Structure (22 sections, 11 equations, 6 figures, 8 tables, 1 algorithm)

Figures (6)

  • Figure 1: In the CommonGen (left) and PersonaChat (right) tasks, Decider employs and follows the rules in Table \ref{tab:kb} to mitigate the greedy focus on targets, producing logic-controllable text that not only completes the task requirements but also better aligns with natural human expression.
  • Figure 2: Applying dual-system Decider decoding for CommonGen (top solid arrows) and PersonaChat (bottom dashed arrows) with different rules. In these two simple examples, classroom (top) and library (bottom) are selected by the logical reasoner because, as target words, they satisfy the rule. However, they are suppressed by the PLM because they violate the intuition of fluent human speech. In contrast, camping and grow are good options for the PLM but bad ones for the logical reasoner, because their appearance may cause the generation to deviate from the target in the future (e.g., we may not go camping in the classroom). Through the interplay of the two systems, Decider prefers fluent words that also satisfy our rules.
  • Figure 3: The logical reasoner aims to produce a logic vector over a vocabulary given a user-defined FOL rule. It has (a) a parser that constructs a parse tree and (b) a prover that uses the tree to prove whether each word satisfies the rule. In this example, though $\texttt{learning}$ at the lower right corner is not a target ($\texttt{Equal=0}$), it is semantically relevant to the target $\texttt{classroom}$ since there is an edge typed "used-for" between them in the knowledge graph ConceptNet ($\texttt{Edge=1}$). Hence, the logical disjunction over them selects not only the targets but also the words that contribute to the targets. These words would have been ignored based on the past alone, but they raise the probability of completing the task in the future.
  • Figure 4: Illustration of Decider decoding applied to the base transformer-structured language model during generation.
  • Figure 5: Case study for CommonGen. Underlined words violate an important commonsense principle: we rarely spell out commonsense facts or details unless they are special to the current scene. Benefiting from the rule at row 1 of Tab. \ref{tab:kb}, Decider can infer this special scene at a human level, using the external information shown in purple.
  • ...and 1 more figures
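
The logic-vector computation described in the Figure 3 caption can be sketched as follows. This is an illustrative toy, not the paper's implementation: it assumes binary truth values, a rule of the form "x equals a target, or x shares an edge with a target", and a hand-written edge set standing in for ConceptNet. The names `EDGES`, `TARGETS`, `equal`, `edge`, `rule`, and `logic_vector` are hypothetical.

```python
# Toy facts standing in for ConceptNet edges (hypothetical, hand-written).
EDGES = {("learning", "classroom"), ("student", "classroom")}

# Toy target words for the current task instance (hypothetical).
TARGETS = {"classroom"}

def equal(x, t):
    """Equal(x, t): the word is itself a target."""
    return x == t

def edge(x, t):
    """Edge(x, t): the word is linked to the target in the toy graph,
    in either direction."""
    return (x, t) in EDGES or (t, x) in EDGES

def rule(x):
    """Assumed rule: there exists a target t with Equal(x, t) or Edge(x, t)."""
    return any(equal(x, t) or edge(x, t) for t in TARGETS)

def logic_vector(vocab):
    """Evaluate the rule for every vocabulary word: 1.0 if it holds, else 0.0."""
    return [1.0 if rule(word) else 0.0 for word in vocab]

vocab = ["classroom", "learning", "camping", "the"]
logic = logic_vector(vocab)
print(dict(zip(vocab, logic)))
```

Here `classroom` is selected through `Equal` and `learning` through `Edge`, mirroring the caption's example, while `camping` and `the` satisfy neither predicate. A real system would evaluate the rule in a vectorized way over the full vocabulary rather than word by word.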