Fine-Tuning Language Models Using Formal Methods Feedback

Yunhao Yang; Neel P. Bhatt; Tyler Ingebrand; William Ward; Steven Carr; Zhangyang Wang; Ufuk Topcu

Fine-Tuning Language Models Using Formal Methods Feedback

Yunhao Yang, Neel P. Bhatt, Tyler Ingebrand, William Ward, Steven Carr, Zhangyang Wang, Ufuk Topcu

TL;DR

The paper tackles the challenge of domain-specific control policy synthesis for autonomous systems by proposing a fully automated fine-tuning pipeline, DPO-AF, that uses formal-method feedback instead of human labeling. It converts language-model outputs into automaton-based controllers, verifies them against external specifications in a world model, ranks responses by how many specs they satisfy, and fine-tunes the LM via Direct Preference Optimization using these automated signals. Key contributions include the formalization of automaton-based representations for NL tasks, a prompting and controller-construction workflow (GLM2FSA) to generate verifiable controllers, and an experimental demonstration on autonomous driving showing spec satisfaction rising from $0.60$ to $0.90$ with Carla-backed empirical validation and real-world grounding. The work reduces human-in-the-loop labor, provides mathematical guarantees through formal verification where possible, and highlights practical viability for safe deployment in real-world autonomous systems.

Abstract

Although pre-trained language models encode generic knowledge beneficial for planning and control, they may fail to generate appropriate control policies for domain-specific tasks. Existing fine-tuning methods use human feedback to address this limitation, however, sourcing human feedback is labor intensive and costly. We present a fully automated approach to fine-tune pre-trained language models for applications in autonomous systems, bridging the gap between generic knowledge and domain-specific requirements while reducing cost. The method synthesizes automaton-based controllers from pre-trained models guided by natural language task descriptions. These controllers are verifiable against independently provided specifications within a world model, which can be abstract or obtained from a high-fidelity simulator. Controllers with high compliance with the desired specifications receive higher ranks, guiding the iterative fine-tuning process. We provide quantitative evidences, primarily in autonomous driving, to demonstrate the method's effectiveness across multiple tasks. The results indicate an improvement in percentage of specifications satisfied by the controller from 60% to 90%.

Fine-Tuning Language Models Using Formal Methods Feedback

TL;DR

with Carla-backed empirical validation and real-world grounding. The work reduces human-in-the-loop labor, provides mathematical guarantees through formal verification where possible, and highlights practical viability for safe deployment in real-world autonomous systems.

Abstract

Paper Structure (45 sections, 1 theorem, 4 equations, 18 figures, 1 algorithm)

This paper contains 45 sections, 1 theorem, 4 equations, 18 figures, 1 algorithm.

Introduction
Related Work
Fine-tuning from Human Feedback.
Fine-tuning from Generated Outputs
Formal Methods and Verification on Language Models.
Preliminaries
Automaton-Based Model for System or Environment.
Automaton-Based Controller.
Fine-Tuning Pre-trained Language Model for Autonomous System
Automaton-Based Representation for Natural Language and Autonomous System
Modeling the Autonomous System.
Task Prompt Engineering.
Controller Construction.
Automated Feedback
Formal Verification.
...and 30 more sections

Key Result

Theorem 1

If $\mathcal{M}$ captures complete information from $\mathcal{S}$, then

Figures (18)

Figure 1: Examples of an automaton-based model (top) and a controller (bottom).
Figure 2: The overall pipeline of fine-tuning a language model for autonomous systems via automated feedback. We mark the inputs to the pipeline in purple and the output in blue.
Figure 3: This diagram depicts the method of ranking responses by formal verification of the induced automata. We present the sample automata in Figures \ref{['fig: regular_traffic_light']} and \ref{['fig: right']}.
Figure 4: Illustration of two sample scenarios from the autonomous driving system. The left figure is an intersection with the traffic light. We encode this scenario in a model in Figure \ref{['fig: regular_traffic_light']}. The right figure is an intersection with a wide median. We encode it in a model in Figure \ref{['fig: left_turn_at_wide_median']}.
Figure 5: An automaton-based model represents a vehicle's environment dynamics at a regular traffic signal at an intersection. TL represents "traffic light," and ped represents "pedestrian."
...and 13 more figures

Theorems & Definitions (3)

Definition 1
Theorem 1
proof

Fine-Tuning Language Models Using Formal Methods Feedback

TL;DR

Abstract

Fine-Tuning Language Models Using Formal Methods Feedback

Authors

TL;DR

Abstract

Table of Contents

Key Result

Figures (18)

Theorems & Definitions (3)