
RealAppliance: Let High-fidelity Appliance Assets Controllable and Workable as Aligned Real Manuals

Yuzheng Gao, Yuxing Long, Lei Kang, Yuchong Guo, Ziyan Yu, Shangqing Mao, Jiyao Zhang, Ruihai Wu, Dongjiang Li, Hui Shen, Hao Dong

TL;DR

This work introduces RealAppliance, a dataset of 100 high-fidelity appliance assets whose physical mechanisms, electronic mechanisms, and program logic are aligned with real manuals, addressing gaps in realism and manual-grounded operation in prior assets. It also presents RealAppliance-Bench, a multimodal and embodied planning benchmark spanning manual understanding, part grounding, open-loop planning, and closed-loop adjustment. Evaluating state-of-the-art multimodal large language models and embodied planners, the authors find significant challenges in manual grounding and plan robustness, and point to the need for stronger document understanding and fine-grained visual reasoning. The RealAppliance platform offers a realistic testbed for appliance manipulation planning, with potential for broader use in collecting data for low-level manipulation policies and in standardized benchmarks.

Abstract

Existing appliance assets suffer from poor rendering, incomplete mechanisms, and misalignment with manuals, leading to simulation-reality gaps that hinder appliance manipulation development. In this work, we introduce the RealAppliance dataset, comprising 100 high-fidelity appliances with complete physical mechanisms, electronic mechanisms, and program logic aligned with their manuals. Based on these assets, we propose the RealAppliance-Bench benchmark, which evaluates multimodal large language models and embodied manipulation planning models across key tasks in appliance manipulation planning: manual page retrieval, appliance part grounding, open-loop manipulation planning, and closed-loop planning adjustment. Our analysis of model performance on RealAppliance-Bench provides insights for advancing appliance manipulation research.

Paper Structure

This paper contains 16 sections, 5 figures, and 2 tables.

Figures (5)

  • Figure 1: Overview of RealAppliance. We collect 100 real appliance manuals and create 100 high-fidelity appliance digital assets aligned with these manuals. Every appliance asset has the same size, texture, physical mechanisms, and electronic mechanisms as the real one.
  • Figure 2: Creation process of appliance digital assets in RealAppliance.
  • Figure 3: Evaluation tasks in RealAppliance-Bench. These tasks cover the essential capabilities in appliance manipulation planning.
  • Figure 4: Statistics visualization of RealAppliance-Bench. (Left) We display the proportion of each appliance type in a pie chart. (Middle) We conduct a user study on the fidelity of the appliance assets. (Right) We visualize the atomic action transitions in task planning to demonstrate planning diversity.
  • Figure 5: Failure case study on RealAppliance-Bench.