PAARS: Persona Aligned Agentic Retail Shoppers
Saab Mansour, Leonardo Perelli, Lorenzo Mainetti, George Davidson, Stefano D'Amato
TL;DR
PAARS proposes a persona-driven framework to simulate retail shoppers with LLM agents, addressing biases and privacy by inducing synthetic personas from anonymized histories. It introduces an alignment suite that evaluates both individual and group similarity to human shoppers, using KL divergence to capture population-level fidelity. Empirical results show that persona conditioning improves query generation, item selection, and session diversity, and demonstrate a preliminary agent-based A/B testing capability. The framework is positioned as scalable and domain-agnostic, with potential applications in offline experimentation, surveying, and inclusivity of underrepresented groups, while highlighting ethical and methodological limitations that warrant careful future work.
Abstract
In e-commerce, collecting behavioral data for decision making can be costly and slow. Simulation with LLM-powered agents is emerging as a promising alternative for representing the behavior of human populations. However, LLMs are known to exhibit certain biases, such as brand bias, review-rating bias and limited representation of certain groups in the population, so they need to be carefully benchmarked and aligned to user behavior. Ultimately, our goal is to synthesize an agent population and verify that it collectively approximates a real sample of humans. To this end, we propose a framework that: (i) creates synthetic shopping agents by automatically mining personas from anonymized historical shopping data, (ii) equips agents with retail-specific tools to synthesize shopping sessions and (iii) introduces a novel alignment suite that measures distributional differences between humans and shopping agents at the group (i.e. population) level rather than at the traditional "individual" level. Experimental results demonstrate that using personas improves performance on the alignment suite, though a gap to human behavior remains. We showcase an initial application of our framework to automated agentic A/B testing and compare the findings to human results. Finally, we discuss applications, limitations and challenges, setting the stage for impactful future work.
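To make the group-level idea concrete, the following is a minimal sketch of how a distributional comparison with KL divergence could be computed between a human sample and an agent sample. It is an illustration under stated assumptions, not the framework's implementation: the function name `kl_divergence`, the choice of categorical events (e.g. the product category chosen in each session), the direction KL(human || agent) and the additive smoothing constant are all hypothetical choices made for this example.

```python
import math
from collections import Counter


def kl_divergence(human_events, agent_events, smoothing=1e-6):
    """Return KL(human || agent) over the union of observed categories.

    Both arguments are lists of categorical observations, one per session.
    Additive smoothing (an assumption of this sketch) keeps the divergence
    finite when a category appears in only one of the two populations.
    """
    categories = sorted(set(human_events) | set(agent_events))
    human_counts = Counter(human_events)
    agent_counts = Counter(agent_events)

    # Normalizers include the smoothing mass added to every category.
    human_total = len(human_events) + smoothing * len(categories)
    agent_total = len(agent_events) + smoothing * len(categories)

    divergence = 0.0
    for category in categories:
        p = (human_counts[category] + smoothing) / human_total
        q = (agent_counts[category] + smoothing) / agent_total
        divergence += p * math.log(p / q)
    return divergence


# Hypothetical data: one chosen category per shopping session.
humans = ["shoes", "shoes", "books", "toys"]
agents = ["shoes", "books", "books", "books"]
print(kl_divergence(humans, agents))
```

A value near zero indicates that the agent population reproduces the human distribution over the chosen categories, whereas an individual-level metric would instead compare each agent with its matched human one pair at a time.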
