ASID: Active Exploration for System Identification in Robotic Manipulation

Marius Memmel; Andrew Wagenmaker; Chuning Zhu; Patrick Yin; Dieter Fox; Abhishek Gupta

TL;DR

ASID addresses the sample inefficiency of real-world reinforcement learning with a targeted sim-to-real pipeline. It trains an exploration policy in simulation that maximizes Fisher information, rolls that policy out in the real world to collect informative data, refines the simulator with that data via system identification, and trains downstream task policies in the refined simulator for zero-shot real-world deployment. The approach identifies unknown physical and geometric parameters in sphere manipulation, rod balancing, and articulation tasks using only a small amount of real data, and is demonstrated on real-world rod balancing and shuffleboard. The work connects classical system identification with modern sim-to-real methods, offering a principled, data-efficient route to robotic manipulation in real environments.
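For readers unfamiliar with the quantity being maximized, the block below restates the textbook definition of Fisher information and the Cramér-Rao bound that motivates it. The notation ($\theta$ for the unknown simulator parameters, $\tau$ for a trajectory, $p_\theta(\tau \mid \pi)$ for the trajectory distribution under policy $\pi$) is assumed here for illustration, and the trace criterion in the last line is one common design choice (A-optimality), not necessarily the exact objective used in the paper.

```latex
% Assumed notation: \theta = unknown parameters, \pi = exploration policy,
% \tau = trajectory, p_\theta(\tau \mid \pi) = trajectory distribution.
\mathcal{I}(\theta, \pi)
  = \mathbb{E}_{\tau \sim p_\theta(\cdot \mid \pi)}
    \left[ \nabla_\theta \log p_\theta(\tau \mid \pi)\,
           \nabla_\theta \log p_\theta(\tau \mid \pi)^{\top} \right]

% Cramer-Rao bound for any unbiased estimator \hat{\theta} built from one trajectory:
\operatorname{Cov}(\hat{\theta}) \succeq \mathcal{I}(\theta, \pi)^{-1}

% One common scalar design criterion (A-optimality), assumed for illustration:
\pi_{\mathrm{exp}} \in \arg\min_{\pi}
  \operatorname{tr}\!\left( \mathcal{I}(\theta, \pi)^{-1} \right)
```

Intuitively, a policy that makes the observed trajectory more sensitive to $\theta$ increases the Fisher information and therefore lowers the best achievable estimation error.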

Abstract

Model-free control strategies such as reinforcement learning have shown the ability to learn control strategies without requiring an accurate model or simulator of the world. While this is appealing due to the lack of modeling requirements, such methods can be sample inefficient, making them impractical in many real-world domains. On the other hand, model-based control techniques leveraging accurate simulators can circumvent these challenges and use a large amount of cheap simulation data to learn controllers that can effectively transfer to the real world. The challenge with such model-based techniques is the requirement for an extremely accurate simulation, requiring both the specification of appropriate simulation assets and physical parameters. This requires considerable human effort to design for every environment being considered. In this work, we propose a learning system that can leverage a small amount of real-world data to autonomously refine a simulation model and then plan an accurate control strategy that can be deployed in the real world. Our approach critically relies on utilizing an initial (possibly inaccurate) simulator to design effective exploration policies that, when deployed in the real world, collect high-quality data. We demonstrate the efficacy of this paradigm in identifying articulation, mass, and other physical parameters in several challenging robotic manipulation tasks, and illustrate that only a small amount of real-world data can allow for effective sim-to-real transfer. Project website at https://weirdlabuw.github.io/asid

Paper Structure (29 sections, 9 equations, 13 figures, 3 tables)

Introduction
Related Work
Preliminaries
Parameter Estimation and Fisher Information
ASID: Targeted Exploration for Test-Time Simulation Construction, Identification, and Policy Optimization
Exploration via Fisher Information Maximization
Implementing Fisher Information Maximization
System Identification
Solving the Downstream Task
Experimental Evaluation
Ablations and Baseline Comparisons
Simulated Task Descriptions
Does ASID learn effective exploration behavior?
How does ASID perform quantitatively in simulation on downstream tasks?
Does ASID allow for real-world controller synthesis using minimal real-world data?
...and 14 more sections

Figures (13)

Figure 1: ASID: A depiction of our proposed process of active exploration for system identification, from learning exploration policies to real-world deployment.
Figure 2: Overview of ASID: (1) Train an exploration policy $\pi_{\mathrm{exp}}$ that maximizes the Fisher information, leveraging the vast amount of cheap simulation data. (2) Roll out $\pi_{\mathrm{exp}}$ in real to collect informative data that can be used to (3) run system identification to identify physics parameters and reconstruct, e.g., geometric, collision, and kinematic properties. (4) Train a task-specific policy $\pi_{\mathrm{task}}$ in the updated simulator and (5) zero-shot transfer $\pi_{\mathrm{task}}$ to the real world.
Figure 3: Depiction of environments in simulation.
Figure 4: Visitation frequency of the sphere when explored by different exploration policies on the multi-friction environment. ASID activates the sphere over a much larger area, thereby identifying parameters more accurately.
Figure 5: Real-world Rod Balancing: Simulation setup for training exploration and downstream task policies (left). Successful execution of autonomous real-world rod balancing with skewed mass (right).
...and 8 more figures
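
The five steps in the Figure 2 caption can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a hypothetical 1D puck with an unknown friction coefficient `mu`, i.i.d. Gaussian position noise, a finite-difference estimate of the Fisher information, and plain grid search in place of the reinforcement-learning and system-identification components used in the paper. All names (`simulate`, `fisher_information`, `mu_nominal`, and so on) are made up for this example.

```python
import numpy as np

G = 9.81  # gravitational acceleration, m/s^2


def simulate(mu, v0, steps=50, dt=0.05):
    """Toy simulator: puck positions after a push with initial speed v0."""
    x, v, xs = 0.0, v0, []
    for _ in range(steps):
        x += v * dt
        v = max(v - mu * G * dt, 0.0)  # Coulomb friction until the puck stops
        xs.append(x)
    return np.array(xs)


def fisher_information(mu, v0, sigma=0.01, eps=1e-4):
    """Scalar Fisher information about mu under i.i.d. Gaussian position noise."""
    # Central finite difference of the trajectory with respect to mu.
    sensitivity = (simulate(mu + eps, v0) - simulate(mu - eps, v0)) / (2 * eps)
    return float(sensitivity @ sensitivity) / sigma**2


# (1) Pick the exploration action in the nominal (possibly wrong) simulator.
mu_nominal = 0.3
candidate_pushes = [0.0, 0.5, 1.0, 2.0]
best_v0 = max(candidate_pushes, key=lambda v0: fisher_information(mu_nominal, v0))

# (2) Roll it out once in the "real world" (here: same model, hidden mu, noise).
mu_true, sigma = 0.5, 0.01
rng = np.random.default_rng(0)
real_positions = simulate(mu_true, best_v0) + rng.normal(0.0, sigma, size=50)

# (3) System identification: least-squares fit of mu over a grid.
mu_grid = np.linspace(0.05, 1.0, 96)
errors = [np.sum((simulate(mu, best_v0) - real_positions) ** 2) for mu in mu_grid]
mu_hat = float(mu_grid[int(np.argmin(errors))])

# (4) Solve the task in the refined simulator: stop the puck at `target`.
target = 0.6
push_grid = np.linspace(0.5, 4.0, 351)
task_v0 = min(push_grid, key=lambda v0: abs(simulate(mu_hat, v0)[-1] - target))

# (5) Zero-shot deployment in the "real world".
final_real = simulate(mu_true, float(task_v0))[-1]
print(f"exploration push: {best_v0}, mu_hat: {mu_hat:.2f}, "
      f"final position: {final_real:.3f} (target {target})")
```

Note that a push of `0.0` leaves the puck at rest, so the trajectory carries no information about friction. This is the toy analogue of why a targeted exploration policy matters: only actions that excite the unknown parameter make it identifiable.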
