SIM1: Physics-Aligned Simulator as Zero-Shot Data Scaler in Deformable Worlds

Yunsong Zhou, Hangxu Liu, Xuekun Jiang, Xing Shen, Yuanzhen Zhou, Hui Wang, Baole Fang, Yang Tian, Mulin Yu, Qiaojun Yu, Li Ma, Hengjie Li, Hanqing Wang, Jia Zeng, Jiangmiao Pang

Abstract

Robotic manipulation with deformable objects represents a data-intensive regime in embodied learning, where shape, contact, and topology co-evolve in ways that far exceed the variability of rigids. Although simulation promises relief from the cost of real-world data acquisition, prevailing sim-to-real pipelines remain rooted in rigid-body abstractions, producing mismatched geometry, fragile soft dynamics, and motion primitives poorly suited for cloth interaction. We posit that simulation fails not for being synthetic, but for being ungrounded. To address this, we introduce SIM1, a physics-aligned real-to-sim-to-real data engine that grounds simulation in the physical world. Given limited demonstrations, the system digitizes scenes into metric-consistent twins, calibrates deformable dynamics through elastic modeling, and expands behaviors via diffusion-based trajectory generation with quality filtering. This pipeline transforms sparse observations into scaled synthetic supervision with near-demonstration fidelity. Experiments show that policies trained on purely synthetic data achieve parity with real-data baselines at a 1:15 equivalence ratio, while delivering 90% zero-shot success and 50% generalization gains in real-world deployment. These results validate physics-aligned simulation as scalable supervision for deformable manipulation and a practical pathway for data-efficient policy learning.

Paper Structure

This paper contains 27 sections, 12 equations, 14 figures, 4 tables, 1 algorithm.

Figures (14)

  • Figure 1: SIM1 pioneers real-to-sim-to-real data generation for deformable manipulation. It constructs simulation data whose deployment behavior matches reality, enabling zero-shot transfer and scalable performance on physical robots.
  • Figure 2: Framework of SIM1. (1) Real-world objects are reconstructed into metric-accurate, textured simulation assets. (2) They are then executed within a deformation-stable simulation framework calibrated through real-to-sim behavior matching. (3) Upon physical alignment, diverse manipulation trajectories are synthesized via structured subtask decomposition and diffusion-based motion generation, and rendered with appearance randomization to produce real-equivalent synthetic training data.
  • Figure 3: Paradigm of deformation-stable physics simulation. (a) After naive VBD (doi:10.1145/3658179) updates under external forces, edge deformation is monitored and virtual elastic constraints are activated when stretch exceeds a threshold, injecting strain forces that accelerate convergence toward physically plausible cloth configurations. (b) A bidirectionally synchronized simulation infrastructure replays identical dual-arm executions in simulation and aligns deformation behaviors through visual calibration.
  • Figure 4: Illustration of data collection and evaluation. (a) Real-world and simulated data collection via kinesthetic teaching and isomorphic teleoperation on Arx ACONE and Arx X5. (b) Domain settings for in-domain and out-of-domain evaluation in real-world experiments. Representative long-horizon T-shirt folding task (over 20 seconds) illustrating complex sequential manipulation capabilities.
  • Figure 5: Illustration of assets used in data generation. Scanned deformable assets and open-source environmental assets used in simulation (top-left). Diverse garment textures for appearance variation (top-right). Room-scale environments with randomized layouts and lighting for scene-level randomization (bottom).
  • ...and 9 more figures
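
The stretch-triggered constraint described in the Figure 3(a) caption can be illustrated with a small sketch. This is not the paper's VBD formulation: the function name, the Hookean force law, the stretch threshold of 1.1, and the stiffness value are all assumptions chosen for illustration. The sketch only shows the gating idea, namely that an edge contributes a restoring force once its stretch ratio exceeds a threshold and is otherwise left alone.

```python
import math


def virtual_elastic_forces(positions, edges, rest_lengths,
                           stretch_threshold=1.1, stiffness=50.0):
    """Per-vertex forces from edges stretched beyond a threshold.

    Hypothetical helper: the threshold, stiffness, and Hookean law are
    illustrative assumptions, not values or equations from the paper.
    """
    forces = [[0.0, 0.0, 0.0] for _ in positions]
    for (i, j), rest in zip(edges, rest_lengths):
        # Vector from vertex i to vertex j and its current length.
        delta = [b - a for a, b in zip(positions[i], positions[j])]
        length = math.sqrt(sum(d * d for d in delta))
        # The constraint stays inactive while stretch is within the threshold.
        if length == 0.0 or length / rest <= stretch_threshold:
            continue
        # Restoring force proportional to the excess over the rest length.
        magnitude = stiffness * (length - rest)
        for axis in range(3):
            f = magnitude * delta[axis] / length
            forces[i][axis] += f  # pulls vertex i toward vertex j
            forces[j][axis] -= f  # pulls vertex j toward vertex i
    return forces


# Two edges sharing vertex 1: the first is over-stretched, the second is at rest.
positions = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.0, 0.0)]
edges = [(0, 1), (1, 2)]
rest_lengths = [1.0, 1.0]
forces = virtual_elastic_forces(positions, edges, rest_lengths)
```

In this toy setup only the first edge (stretch ratio 1.5) activates, so vertices 0 and 1 receive equal and opposite forces along the x axis while vertex 2 is untouched. A real solver would fold such forces into its per-vertex update rather than return them separately.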