FloCA: Towards Faithful and Logically Consistent Flowchart Reasoning

Jinzi Zou; Bolin Wang; Liang Li; Shuo Zhang; Nuo Xu; Junzhou Zhao

TL;DR

This work formalizes Flowchart-Oriented Dialogue (FOD), where multi-turn conversations must progress along a flowchart topology with faithful node-grounding and valid transitions. It introduces FloCA, a zero-shot agent that delegates topology-aware reasoning to a dedicated flowchart reasoning tool while handling intent understanding and response generation with an instruction-following LLM. A novel interactive evaluation framework pairs FloCA with an LLM-based user simulator and five metrics to assess reasoning accuracy and interaction efficiency. Empirical results on FLODIAL and PFDial show FloCA achieving state-of-the-art task success and stronger logical consistency than baselines that rely on RAG, VLMs, or fine-tuning, demonstrating the value of explicit topology-constrained graph execution. The work also provides a practical framework and codebase to advance faithful, flowchart-guided reasoning in real-world decision-support and procedural tasks.

Abstract

Flowchart-oriented dialogue (FOD) systems aim to guide users through multi-turn decision-making or operational procedures by following a domain-specific flowchart to achieve a task goal. In this work, we formalize flowchart reasoning in FOD as grounding user input to flowchart nodes at each dialogue turn while ensuring that every node transition is consistent with the correct flowchart path. Despite recent advances in LLMs for task-oriented dialogue systems, adapting them to FOD still faces two limitations: (1) LLMs lack an explicit mechanism to represent and reason over flowchart topology, and (2) they are prone to hallucinations, leading to unfaithful flowchart reasoning. To address these limitations, we propose FloCA, a zero-shot flowchart-oriented conversational agent. FloCA uses an LLM for intent understanding and response generation while delegating flowchart reasoning to an external tool that performs topology-constrained graph execution, ensuring faithful and logically consistent node transitions across dialogue turns. We further introduce an evaluation framework with an LLM-based user simulator and five new metrics covering reasoning accuracy and interaction efficiency. Extensive experiments on the FLODIAL and PFDial datasets highlight the bottlenecks of existing LLM-based methods and demonstrate the superiority of FloCA. Our code is available at https://github.com/Jinzi-Zou/FloCA-flowchart-reasoning.
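To make the idea of topology-constrained graph execution concrete, the following is a minimal sketch and not the authors' implementation. It assumes the flowchart is a directed graph whose outgoing edges are labelled with the user answers that trigger them. The `Flowchart` class, the `step` and `run` functions, and the troubleshooting nodes are all hypothetical names chosen for illustration. In a full agent, an LLM would map the user's free-form utterance to one of the allowed edge labels; here the labels are passed in directly.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Flowchart:
    """Hypothetical flowchart: node id -> {edge label -> next node id}."""

    text: dict[str, str]
    edges: dict[str, dict[str, str]] = field(default_factory=dict)

    def allowed_answers(self, node: str) -> list[str]:
        # Edge labels leaving `node`; terminal nodes have none.
        return sorted(self.edges.get(node, {}))

    def step(self, node: str, answer: str) -> str:
        # Follow only an edge that exists in the graph, so a transition
        # can never jump to a node the topology does not allow.
        options = self.edges.get(node, {})
        if answer not in options:
            raise ValueError(f"no edge {answer!r} from node {node!r}")
        return options[answer]


def run(chart: Flowchart, start: str, answers: list[str]) -> list[str]:
    # Replay a sequence of grounded answers and return the visited path.
    path = [start]
    node = start
    for answer in answers:
        node = chart.step(node, answer)
        path.append(node)
    return path


# Illustrative troubleshooting flowchart (assumed, not from the datasets).
chart = Flowchart(
    text={
        "q_power": "Does the device power on?",
        "q_cable": "Is the power cable plugged in?",
        "fix_cable": "Plug in the power cable.",
        "fix_service": "Take the device to a service center.",
        "done": "No power fault found.",
    },
    edges={
        "q_power": {"yes": "done", "no": "q_cable"},
        "q_cable": {"yes": "fix_service", "no": "fix_cable"},
    },
)

path = run(chart, "q_power", ["no", "no"])
print(" -> ".join(path))
```

Because `step` raises on any label that is not an outgoing edge of the current node, an ungrounded or hallucinated answer cannot move the dialogue state; the caller would instead ask the user to clarify.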

Paper Structure (17 sections, 5 equations, 4 figures, 3 tables)

Introduction
Related Work
Flowchart-oriented Dialogues
FloCA: A Flowchart Reasoning Agent
Flowchart Reasoning Tool
Initial Node Grounding
Interactive Flowchart Reasoning
Domain Knowledge QA
Evaluation Framework for FOD Task
User Simulator
Flowchart Reasoning Metrics
Experiments
Settings
Results on FLODIAL Dataset
Results on PFDial Dataset
...and 2 more sections

Figures (4)

Figure 1: Comparison of workflows for enabling LLMs to perform flowchart reasoning. (a) Retrieves node attributes that are semantically similar to the input but incorrect due to retrieval errors; (b) and (c) compromise the flowchart's structural integrity and suffer from hallucination; (d) leverages a flowchart reasoning tool to preserve topology and ensure faithful and logically consistent reasoning.
Figure 2: An overview of FloCA. FloCA consists of two core components: an instruction-following LLM and a faithful flowchart reasoning tool. The left figure shows an example dialogue in troubleshooting, where each agent output corresponds to the result of the flowchart reasoning for a specific flowchart node. The right figure depicts the entire multi-turn reasoning process, with colors representing the graph functions and LLM reasoning processes, showing the input and output at each step and how reasoning is carried out.
Figure 3: Local flowchart reasoning accuracy around domain knowledge QA on FLODIAL. "GS", "RAG+GS" and "VLMs" denote baselines of graph serialization methods, RAG-enhanced graph serialization methods, and visual language models, respectively.
Figure 4: Distribution of path-coverage relations on PFDial in the in-domain setting.
