Procedural Knowledge Improves Agentic LLM Workflows

Vincent Hsiao; Mark Roberts; Leslie Smith

TL;DR

This paper tackles the difficulty of planning in agentic LLMs by embedding procedural knowledge through Hierarchical Task Networks (HTNs) within an agentic LLM workflow (ProcLLM). It formalizes an HTN+MDP framework and demonstrates, across four benchmarks, that hand-coded HTNs significantly boost task success and can let smaller LLMs outperform larger baselines, with LLM-generated HTNs providing additional but variable gains. The findings suggest that leveraging procedural knowledge from humans and machines will be a key tool for improving LLM workflows in practice, enabling more reliable planning, faster response times, and better scalability across task complexity.

Abstract

Large language models (LLMs) often struggle when performing agentic tasks without substantial tool support, prompt engineering, or fine-tuning. Despite research showing that domain-dependent, procedural knowledge can dramatically increase planning efficiency, little work evaluates its potential for improving LLM performance on agentic tasks that may require implicit planning. We formalize, implement, and evaluate an agentic LLM workflow that leverages procedural knowledge in the form of a hierarchical task network (HTN). Empirical results of our implementation show that hand-coded HTNs can dramatically improve LLM performance on agentic tasks, and using HTNs can boost a 20b or 70b parameter LLM to outperform a much larger 120b parameter LLM baseline. Furthermore, LLM-created HTNs also improve overall performance, though to a lesser degree. The results suggest that leveraging expertise (from humans, documents, or LLMs) to curate procedural knowledge will become another important tool for improving LLM workflows.
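To make the idea of HTN-supplied procedural knowledge concrete, the following is a minimal sketch and not the paper's ProcLLM implementation. It assumes a heavily simplified HTN in which each compound task has exactly one method (an ordered list of subtasks) and no preconditions. The task names, the `METHODS` table, and `stub_policy` (a stand-in for an LLM call) are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical hand-coded HTN: each compound task maps to one ordered list
# of subtasks. Any task that is absent from the table is treated as primitive.
METHODS: Dict[str, List[str]] = {
    "make_tea": ["boil_water", "steep_tea"],
    "boil_water": ["fill_kettle", "turn_on_kettle"],
}


def decompose(task: str, methods: Dict[str, List[str]]) -> List[str]:
    """Depth-first expansion of a task into an ordered list of primitive tasks."""
    if task not in methods:
        return [task]  # primitive task: nothing left to expand
    plan: List[str] = []
    for subtask in methods[task]:
        plan.extend(decompose(subtask, methods))
    return plan


def stub_policy(subgoal: str, history: List[str]) -> str:
    """Stand-in for an LLM call (assumption): maps a narrow subgoal to an action."""
    return f"do({subgoal})"


def run_workflow(
    task: str,
    methods: Dict[str, List[str]],
    policy: Callable[[str, List[str]], str],
) -> List[str]:
    """Ask the policy for one action per primitive subgoal, in HTN order."""
    history: List[str] = []
    for subgoal in decompose(task, methods):
        history.append(policy(subgoal, history))
    return history


actions = run_workflow("make_tea", METHODS, stub_policy)
print(actions)
```

In this sketch the model is only ever asked about one narrow subgoal at a time, while the ordering of subgoals comes from the task network. That is one plausible reading of how procedural knowledge could relieve an LLM of implicit planning. A fuller HTN would allow several alternative methods per task, each guarded by preconditions on the current state.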
