MeMo: Meaningful, Modular Controllers via Noise Injection

Megan Tjandrasuwita; Jie Xu; Armando Solar-Lezama; Wojciech Matusik

TL;DR

MeMo proposes a modular control framework that learns assembly-specific modules coordinated by a boss controller, trained with a novel modularity objective and noise injection to encourage robust, low-dimensional coordination. By pretraining modules on a single robot and reusing them for morphologies with the same assemblies, MeMo reduces training time for structure and task transfer, outperforming or matching strong baselines like NerveNet and MetaMorph in various locomotion and grasping scenarios. The approach combines behavior cloning with an invariance-to-noise objective, and the authors provide ablations showing the critical role of noise injection in enabling positive transfer. The work highlights interpretable, assembly-aligned control representations and suggests paths toward broader transfer, multi-robot settings, and real-world deployment, while noting sim-to-real and methodological limitations as areas for future work.

Abstract

Robots are often built from standardized assemblies (e.g., arms, legs, or fingers), but each robot must be trained from scratch to control all the actuators of all the parts together. In this paper, we demonstrate a new approach that takes a single robot and its controller as input and produces a set of modular controllers for each of these assemblies, such that when a new robot is built from the same parts, its control can be quickly learned by reusing the modular controllers. We achieve this with a framework called MeMo, which learns (Me)aningful, (Mo)dular controllers. Specifically, we propose a novel modularity objective to learn an appropriate division of labor among the modules. We demonstrate that this objective can be optimized simultaneously with a standard behavior cloning loss via noise injection. We benchmark our framework in locomotion and grasping environments on simple-to-complex robot morphology transfer. We also show that the modules help in task transfer. On both structure and task transfer, MeMo achieves improved training efficiency compared to graph neural network and Transformer baselines.
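
To make the idea of behavior cloning with noise injection concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the single-layer "networks", the toy dimensions, the Gaussian noise model, and the choice that each module sees only its boss signal are all assumptions, and the names `linear`, `act`, and `noisy_bc_loss` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def linear(in_dim, out_dim):
    """Hypothetical stand-in for a small network: one random linear layer + tanh."""
    w = rng.normal(scale=0.1, size=(in_dim, out_dim))
    return lambda x: np.tanh(x @ w)

# Assumed toy sizes (not taken from the paper).
OBS_DIM, SIGNAL_DIM, N_MODULES, ACT_PER_MODULE = 12, 4, 3, 5

# Boss controller: maps the full observation to one signal per module.
boss = linear(OBS_DIM, N_MODULES * SIGNAL_DIM)
# One module per assembly: maps its signal to that assembly's actuator commands.
modules = [linear(SIGNAL_DIM, ACT_PER_MODULE) for _ in range(N_MODULES)]

def act(obs, noise_std=0.0):
    """Run boss then modules; Gaussian noise (assumed) perturbs the boss signals."""
    signals = boss(obs).reshape(-1, N_MODULES, SIGNAL_DIM)
    signals = signals + noise_std * rng.normal(size=signals.shape)
    parts = [m(signals[:, i]) for i, m in enumerate(modules)]
    return np.concatenate(parts, axis=-1)

def noisy_bc_loss(obs, expert_actions, noise_std):
    """Mean squared error between expert actions and actions from noisy signals."""
    return float(np.mean((act(obs, noise_std) - expert_actions) ** 2))

# Synthetic batch standing in for expert demonstrations.
obs = rng.normal(size=(8, OBS_DIM))
expert_actions = rng.uniform(-1, 1, size=(8, N_MODULES * ACT_PER_MODULE))
loss = noisy_bc_loss(obs, expert_actions, noise_std=0.1)
```

In this sketch, minimizing `noisy_bc_loss` pushes the modules to reproduce the expert's actions even when the boss signal is perturbed, which is the intuition behind optimizing the cloning and modularity objectives together.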

Paper Structure (26 sections, 12 equations, 21 figures, 6 tables)

Introduction
Motivation for Modularity Objectives
Method
Objective
Modular Architecture and Training Pipeline
Related Work
Experiments
Transfer Learning
Results
Ablation Study
Analysis
Conclusion
Appendix
Limitations and Future Work
Broader Impact
...and 11 more sections

Figures (21)

Figure 1: Graph Structure and Neural Network Modules of the 6 Leg Centipede. Left: The robot's joints are labeled numerically and circled. Right: The joints form the nodes and the links are the edges. The subset of joints that form each leg module are circled in red, while those that comprise each body module are circled in blue. Neural network modules are denoted as $\textbf{W}_k^i$, where $k$ refers to the type, e.g. all leg modules are type 0, and $i$ denotes different instances of the same module type.
Figure 2: Effect of Modularity Objectives. Consider a module with 5 actuators, denoted in orange, trained to push a lever clockwise. As the state of the lever is a function of its angle $\theta$, a module trained by MeMo represents the control signals as a one-dimensional manifold with respect to the boss controller B's signal. When noise is added to B's signal, the output actions remain on the manifold. Without MeMo, perturbations to B's signal cause deviations from the high-reward trajectory.
Figure 3: Noise Injection Error. Over the course of training, we compute ratio = $|\mathcal{L}_p| / (\mathcal{L}_1 + \mathcal{L}_2)$ where $|\mathcal{L}_p|$ is the magnitude of the mean product term over the minibatch and $\mathcal{L}_1$ and $\mathcal{L}_2$ are the mean behavior cloning and invariance to noise losses. We compute training statistics over 5 runs and indicate standard deviation by shaded areas. (Left)-(Right): For all starting morphologies, the modularity objectives dominate the loss as the ratio is less than 1 for all updates.
Figure 4: Training Pipeline Overview. In Phase 1, we first train an expert controller for the training robot using RL. In Phase 2, we pretrain modules with noise injection during imitation learning. In Phase 3, we transfer the modules to a different context and retrain the boss controller $\textbf{B}_3$.
Figure 5: Structure Transfer Tasks. Left: Transfer "leg" and "body" modules from a 6 to a 12 leg centipede. Left Middle: Transfer "body" and "head" modules from a 6 to a 10 leg worm. Right Middle: Transfer "leg," "head," and "body" modules from a 6 to a 10 leg hybrid. Right: Transfer "arm" and "finger" modules from a 4 to a 5 finger claw.
...and 16 more figures
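
The ratio in the Figure 3 caption can be illustrated with a small NumPy sketch. It assumes one plausible reading of the three terms, namely that a squared-error loss on noise-perturbed actions splits into a behavior cloning term, an invariance-to-noise term, and a cross (product) term. The exact definitions in the paper may differ, and the function name `noise_injection_terms` and the synthetic data are hypothetical.

```python
import numpy as np

def noise_injection_terms(clean_actions, noisy_actions, expert_actions):
    """Split the noisy squared-error loss into L1, L2, and the product term.

    Assumed decomposition (not confirmed by the source):
      ||a_noisy - a*||^2 = ||a_clean - a*||^2 + ||a_noisy - a_clean||^2
                           + 2 (a_clean - a*) . (a_noisy - a_clean)
    """
    err = clean_actions - expert_actions      # behavior cloning residual
    drift = noisy_actions - clean_actions     # action change caused by the noise
    l1 = np.mean(np.sum(err ** 2, axis=-1))             # behavior cloning loss
    l2 = np.mean(np.sum(drift ** 2, axis=-1))           # invariance-to-noise loss
    lp = np.mean(2.0 * np.sum(err * drift, axis=-1))    # mean product term
    total = np.mean(np.sum((noisy_actions - expert_actions) ** 2, axis=-1))
    return l1, l2, lp, total

# Synthetic minibatch: 64 samples for a module with 5 actuators.
rng = np.random.default_rng(0)
expert = rng.uniform(-1, 1, size=(64, 5))
clean = expert + 0.05 * rng.normal(size=expert.shape)
noisy = clean + 0.02 * rng.normal(size=expert.shape)

l1, l2, lp, total = noise_injection_terms(clean, noisy, expert)
ratio = abs(lp) / (l1 + l2)
```

Under this assumed form, `l1 + l2 + lp` equals the total noisy loss, and the ratio cannot exceed 1 (by the AM-GM inequality), so how far it stays below 1 is the informative quantity.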

Theorems & Definitions (3)

Definition 2.1
Definition 2.2
Definition 3.1
