MarketGen: A Scalable Simulation Platform with Auto-Generated Embodied Supermarket Environments

Xu Hu, Yiyang Feng, Junran Peng, Jiawei He, Liyi Chen, Chuanchen Luo, Xucheng Yin, Qing Li, Zhaoxiang Zhang

TL;DR

MarketGen addresses the lack of scalable benchmarks for complex commercial environments with an auto-generated supermarket simulation built on an agent-based PCG framework and a comprehensive 3D asset library. It pairs automated scene construction with a two-track benchmark (Checkout Unloading and In-Aisle Item Collection) and a modular manipulation system built on visual prompting and planning, whose limits the benchmark probes with long-horizon tasks. Experiments show high-fidelity scene generation and expose the difficulties current modular policies face, while real-world tests indicate promising sim-to-real transfer. Overall, MarketGen offers a practical, end-to-end platform for accelerating embodied AI research in commercial settings and narrowing the sim-to-real gap.

Abstract

The development of embodied agents for complex commercial environments is hindered by a critical gap in existing robotics datasets and benchmarks, which primarily focus on household or tabletop settings with short-horizon tasks. To address this limitation, we introduce MarketGen, a scalable simulation platform with automatic scene generation for complex supermarket environments. MarketGen features a novel agent-based Procedural Content Generation (PCG) framework. It uniquely supports multi-modal inputs (text and reference images) and integrates real-world design principles to automatically generate complete, structured, and realistic supermarkets. We also provide an extensive and diverse 3D asset library with a total of 1100+ supermarket goods and parameterized facility assets. Building on this generative foundation, we propose a novel benchmark for assessing supermarket agents, featuring two everyday supermarket tasks: (1) Checkout Unloading: long-horizon tabletop tasks for cashier agents, and (2) In-Aisle Item Collection: complex mobile manipulation tasks for salesperson agents. We validate our platform and benchmark through extensive experiments, including the deployment of a modular agent system and successful sim-to-real transfer. MarketGen provides a comprehensive framework to accelerate research in embodied AI for complex commercial applications.

Paper Structure

This paper contains 15 sections, 7 figures, 3 tables.

Figures (7)

  • Figure 1: Overview of MarketGen. MarketGen is a scalable simulation platform with auto-generated scenes for supermarket scenarios. It differs from previous platforms and methods: (b) handcrafted supermarket scenes in GRUtopia [wang2024grutopia], (c) tabletop task generation [gao2025genmanip], and (d) rule-based [infinigen2024indoors, procthor, zhang2025agentworld] and LLM-based [infiniteworld, Yang_2024_CVPR] household scene generation.
  • Figure 2: The Pipeline of Automatic Scene Generation. The agent system first generates a structured spatial layout and semantic information from the input text and reference image. The PCG workflow then completes scene construction.
  • Figure 3: PCG Workflow with parameterized facilities.
  • Figure 4: An overview of our benchmark. There are two tracks: Checkout Unloading for tabletop manipulation tasks and In-Aisle Item Collection for mobile manipulation tasks.
  • Figure 5: Results from our automatic scene generation pipeline. The generated scenes reach a level of fidelity and logical coherence comparable to handcrafted scenes in GRUtopia.
  • ...and 2 more figures
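
The two-stage idea in the Figure 2 and Figure 3 captions (a structured layout first, then procedural construction with parameterized facilities) can be illustrated with a toy sketch. This is not MarketGen's implementation or API: the `Zone` and `Shelf` types, the `place_shelves` and `build_scene` functions, and all dimensions are hypothetical, and the layout is hard-coded here where the paper describes an agent system producing it from text and a reference image.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """One rectangular region of a (hypothetical) structured layout, in metres."""
    name: str      # semantic label, e.g. "snacks"
    x: float       # lower-left corner
    y: float
    width: float   # extent along x
    depth: float   # extent along y


@dataclass
class Shelf:
    """A placed shelf unit; length is the parameter a PCG step would vary."""
    zone: str
    x: float
    y: float
    length: float


def place_shelves(zone, shelf_length=2.0, shelf_depth=0.5, aisle_width=1.5):
    """Fill a zone with parallel shelf rows separated by walkable aisles."""
    shelves = []
    per_row = int(zone.width // shelf_length)   # whole units that fit in one row
    row_pitch = shelf_depth + aisle_width       # distance between row origins
    row_y = zone.y
    while row_y + shelf_depth <= zone.y + zone.depth:
        for i in range(per_row):
            shelves.append(
                Shelf(zone.name, zone.x + i * shelf_length, row_y, shelf_length)
            )
        row_y += row_pitch
    return shelves


def build_scene(layout):
    """Stage 2: expand every zone of the layout into concrete shelf placements."""
    return [shelf for zone in layout for shelf in place_shelves(zone)]


# Stage 1 stand-in: a hand-written layout instead of agent-generated output.
layout = [
    Zone("snacks", 0.0, 0.0, 6.0, 4.5),
    Zone("drinks", 8.0, 0.0, 4.0, 2.0),
]
scene = build_scene(layout)
for shelf in scene:
    print(shelf)
```

Separating the layout description from the placement routine means the same procedural step can be rerun on any layout, which is the property that makes this style of generation scale.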