Rethinking ChatGPT's Success: Usability and Cognitive Behaviors Enabled by Auto-regressive LLMs' Prompting
Xinzhe Li, Ming Liu
TL;DR
The paper analyzes the limitations of task-specific channels in deploying LLMs and argues that auto-regressive prompting with free-form modalities enables richer, human-like cognitive behaviors. It introduces a modalities-and-channels framework to compare deployment paradigms, highlighting how verbal free-form context and open-ended outputs in AR-LLMs yield superior usability and expressiveness. By examining thinking, reasoning, planning, and feedback learning, the work shows how prompting can imitate complex cognitive activities and discusses challenges such as shortcut learning and distribution shifts. The authors propose a path toward improved autonomous and multi-agent LLM deployment through cognitive-behavioral principles and a unified inference framework, advocating for free-form modalities during pretraining and simpler inference-time deployment.
Abstract
Over the last decade, a wide range of training and deployment strategies for Large Language Models (LLMs) have emerged. Among these, the prompting paradigms of Auto-regressive LLMs (AR-LLMs) have catalyzed a significant surge in Artificial Intelligence (AI). This paper emphasizes the significance of utilizing free-form modalities (forms of input and output) and verbal free-form contexts as user-directed channels (methods for transforming modalities) for downstream deployment. Specifically, we analyze the structure of modalities within two types of LLMs and six task-specific channels during deployment. From the perspective of users, our analysis introduces and applies the analytical metrics of task customizability, transparency, and complexity to gauge their usability, highlighting the superiority of AR-LLMs' prompting paradigms. Moreover, we examine the stimulation of diverse cognitive behaviors in LLMs through the adoption of free-form text and verbal contexts, mirroring human linguistic expressions of such behaviors. We then detail four common cognitive behaviors to underscore how AR-LLMs' prompting successfully imitates human-like behaviors using this free-form modality and channel. Lastly, we identify the potential for improving LLM deployment, both as autonomous agents and within multi-agent systems, via cognitive behavior concepts and principles.
