Subconscious Robotic Imitation Learning

Jun Xie, Zhicheng Wang, Jianwei Tan, Huanxu Lin, Xiaoguang Ma

TL;DR

The paper addresses the latency challenges in robotic imitation learning (RIL) by introducing Subconscious Robotic Imitation Learning (SRIL), which leverages subconscious downsampling, pattern-augmented learning, and cognitive offloading to accelerate policy inference. SRIL combines a SPAM-driven downsampling pipeline with a Transformer-based action predictor that fuses visual and state data, and it adds an execution strategy that uses exponential-weighted future-action blocks gated by a Cognitive Offloading Readiness (COR) metric. Experiments in six dual-arm simulations and three real-robot tasks show SRIL achieves 100–200% faster task execution with comparable or superior success rates compared to state-of-the-art policies like ACT and diffusion-based methods. The results indicate SRIL's potential to deliver real-time, high-accuracy manipulation in dynamic environments and industrial settings, with future work focusing on generalization, adaptive inference, and multi-task robustness.

Abstract

Although robotic imitation learning (RIL) is promising for embodied intelligent robots, existing RIL approaches rely on computationally intensive multi-model trajectory predictions, resulting in slow execution and limited real-time responsiveness. In contrast, the human subconscious constantly processes and stores vast amounts of information from experience, perception, and learning, allowing people to carry out complex actions such as riding a bike without consciously thinking about each movement. Inspired by this phenomenon in action neurology, we introduced subconscious robotic imitation learning (SRIL), wherein cognitive offloading was combined with historical action chunking to reduce delays caused by model inference, thereby accelerating task execution. This process was further enhanced by subconscious downsampling and a pattern-augmented learning policy, wherein intent-rich information was addressed with quantized sampling techniques to improve manipulation efficiency. Experimental results demonstrated that SRIL executed 100% to 200% faster than SOTA policies on comprehensive dual-arm tasks, with consistently higher success rates.

Paper Structure

This paper contains 18 sections, 15 equations, 6 figures, and 2 tables.

Figures (6)

  • Figure 1: Overview of the Subconscious Robotic Imitation Learning (SRIL) framework. Left: The original dense trajectory was subconsciously downsampled to retain key actions. Middle: A Transformer-based pattern-augmented learning policy integrated visual observations and subconscious patterns. Right: The policy performed subconscious imitation learning, skipping redundant actions to accelerate task execution. SRIL greatly sped up execution while preserving performance.
  • Figure 2: Subconscious Downsampling and Subconscious Pattern-Augmented Learning Policy. Left: The demonstration datasets were downsampled via subconscious pattern recognition to create subconscious downsampled datasets, which train the Pattern-Driven Action Prediction Model. Right: The model combines visual and joint data through ResNet encoders and a Transformer architecture for precise manipulation prediction.
  • Figure 3: Cognitive Offloading for Subconscious Imitation Learning: Each inference frame generates an action chunking sequence, triggering cognitive offloading when the COR exceeds the COT, enabling efficient subconscious imitation through action time integration.
  • Figure 4: Overview of the simulation tasks. Task 1 Cube Transfer: The right arm grasped (1.1) and lifted the red cube (1.2), then transferred it to the left arm (1.3). Task 2 Bimanual Stack: The right arm grasped and lifted the red cube (2.1), and the left arm grasped the blue cube (2.2). Then the right arm put the red cube in the middle of the table and the left arm put the blue cube on top of the red cube (2.3). Task 3 Bimanual Extraction: The left arm grasped the blue socket and the right arm grasped the red peg (3.1). Both dragged and extracted the inserted parts (3.2) and placed them on the table (3.3). Task 4 Bimanual Insertion: The left arm grasped the blue socket and the right arm grasped the red peg (4.1). Both lifted (4.2) and inserted the red peg into the blue socket (4.3). Task 5 Bimanual Restore: The left arm picked up the blue box and the right arm picked up the red cube (5.1). Then they lifted and placed the red cube into the blue box above the table (5.2). After that, the left arm put the box back (5.3). Task 6 Transfer and Restore: The right arm grasped (6.1) and transferred the red cube to the left arm (6.2). Then the left arm placed the red cube into the blue box (6.3).
  • Figure 5: Effects of the COT on Subconscious Robotic Imitation Learning (SRIL).
  • ...and 1 more figure
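
The execution strategy summarized in the TL;DR and in the Figure 3 caption (exponentially weighted future-action blocks, with offloading triggered when the COR exceeds the COT) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the `decay` parameter, and the choice to give index 0 the largest weight are assumptions made for illustration. The summary does not say how the COR is computed, so the sketch takes it as an input rather than deriving it.

```python
import math


def exponential_weights(n, decay=0.1):
    """Return n normalized weights proportional to exp(-decay * i).

    `decay` is a hypothetical tuning parameter, not a value from the paper.
    """
    raw = [math.exp(-decay * i) for i in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def blend_predictions(predictions, decay=0.1):
    """Blend several predictions of the same time step into one action.

    `predictions` is a list of equal-length action vectors, one per
    overlapping action chunk that covers this time step. Index 0 receives
    the largest weight (an assumption of this sketch).
    """
    weights = exponential_weights(len(predictions), decay)
    dim = len(predictions[0])
    return [
        sum(w * p[d] for w, p in zip(weights, predictions))
        for d in range(dim)
    ]


def should_offload(cor, cot):
    """Gate from the Figure 3 caption: offload when the COR exceeds the COT.

    How the COR itself is computed is not given here, so it is passed in.
    """
    return cor > cot


if __name__ == "__main__":
    # Three overlapping chunks each predict a 2-DoF action for the same step.
    preds = [[0.10, 0.50], [0.12, 0.48], [0.20, 0.40]]
    print(blend_predictions(preds, decay=0.5))
    print(should_offload(cor=0.8, cot=0.5))
```

With `decay=0` the blend reduces to a plain average. Larger values concentrate the weight on the first prediction in the list.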