Mind the Gap: The Divergence Between Human and LLM-Generated Tasks
Yi-Long Lu, Jiajun Song, Chunhui Zhang, Wei Wang
TL;DR
The paper investigates whether large language model (LLM) agents replicate the value-driven, embodied processes underlying human autonomous task generation. It combines two experiments: a human baseline assessing how personal values and cognitive style, under different environmental conditions, shape task content, and a GPT-4o–based comparison where the model is either raw or conditioned on human profiles. Results show humans generate tasks that are systematically guided by values and environmental context, whereas LLM outputs are more abstract, less social, and less grounded in embodiment, even when provided with value profiles; paradoxically, LLM tasks can feel more novel and fun but are less feasible in real-world, embodied terms. The findings reveal a core gap between human motivational grounding and the statistical patterns of current LLMs, underscoring the need to incorporate intrinsic motivation and physical grounding to achieve more human-aligned autonomous agents.
Abstract
Humans constantly generate a diverse range of tasks guided by internal motivations. While generative agents powered by large language models (LLMs) aim to simulate this complex behavior, it remains uncertain whether they operate on similar cognitive principles. To address this, we conducted a task-generation experiment comparing human responses with those of an LLM agent (GPT-4o). We find that human task generation is consistently influenced by psychological drivers, including personal values (e.g., Openness to Change) and cognitive style. Even when these psychological drivers are explicitly provided to the LLM, it fails to reflect the corresponding behavioral patterns. It produces tasks that are markedly less social, less physical, and thematically biased toward abstraction. Interestingly, the LLM's tasks were perceived as more fun and novel, which highlights a disconnect between its linguistic proficiency and its capacity to generate human-like, embodied goals. We conclude that there is a core gap between the value-driven, embodied nature of human cognition and the statistical patterns of LLMs, highlighting the necessity of incorporating intrinsic motivation and physical grounding into the design of more human-aligned agents.
