DexDrummer: In-Hand, Contact-Rich, and Long-Horizon Dexterous Robot Drumming

Hung-Chieh Fang, Amber Xie, Jennifer Grannen, Kenneth Llontop, Dorsa Sadigh

Abstract

Performing in-hand, contact-rich, and long-horizon dexterous manipulation remains an unsolved challenge in robotics. Prior work on hand dexterity has considered each of these three challenges in isolation but has not combined them in a single, complex task. To test dexterous capabilities further, we propose drumming as a testbed for dexterous manipulation. Drumming naturally integrates all three challenges: it involves in-hand control for stabilizing and adjusting the drumstick with the fingers, contact-rich interaction through repeated striking of the drum surface, and long-horizon coordination when switching between drums and sustaining rhythmic play. We present DexDrummer, a hierarchical object-centric bimanual drumming policy trained in simulation with sim-to-real transfer. The framework reduces the exploration difficulty of pure reinforcement learning by combining trajectory planning with residual RL corrections for fast transitions between drums. A dexterous manipulation policy handles contact-rich dynamics, guided by rewards that explicitly model both finger-stick and stick-drum interactions. In simulation, we show that our policy can play two styles of music: multi-drum, bimanual songs and challenging technical exercises that require increased dexterity. Across simulated bimanual tasks, our dexterous, reactive policy outperforms a fixed-grasp policy in F1 score by 1.87x on easy songs and 1.22x on hard songs. In real-world tasks, we show song performance across a multi-drum setup. DexDrummer is able to play our training song and its extended version with an F1 score of 1.0.
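The abstract reports results as F1 scores but does not define how a played hit is matched to a target note. The following is a minimal sketch of one plausible hit-level F1, assuming each note is a `(time_sec, drum_name)` pair and that a played hit counts as correct when it lands on the right drum within a timing window. The function name `hit_f1`, the greedy nearest-hit matching, and the `tol` value are all illustrative assumptions, not the paper's evaluation protocol.

```python
from collections import defaultdict


def hit_f1(target, played, tol=0.05):
    """Hit-level F1 between target notes and played hits.

    target, played: lists of (time_sec, drum_name) tuples.
    tol: hypothetical timing window in seconds (an assumption, not from the paper).
    """
    # Group played hit times by drum so matches never cross drums.
    unmatched = defaultdict(list)
    for t, drum in played:
        unmatched[drum].append(t)

    true_pos = 0
    for t, drum in sorted(target):
        candidates = unmatched[drum]
        # Pick the closest still-unmatched hit on this drum within the window.
        best = None
        for i, p in enumerate(candidates):
            if abs(p - t) <= tol:
                if best is None or abs(p - t) < abs(candidates[best] - t):
                    best = i
        if best is not None:
            candidates.pop(best)  # each played hit can match only one note
            true_pos += 1

    if true_pos == 0:
        return 0.0
    precision = true_pos / len(played)  # correct hits / all played hits
    recall = true_pos / len(target)     # correct hits / all target notes
    return 2 * precision * recall / (precision + recall)


song = [(0.0, "snare"), (0.5, "hihat"), (1.0, "snare")]
rollout = [(0.01, "snare"), (0.52, "hihat"), (1.2, "snare")]
score = hit_f1(song, rollout)
```

Under this definition, an F1 of 1.0 means every target note was hit on the correct drum within the window and no extra hits were played.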

Paper Structure (25 sections, 8 figures, 2 tables)

Figures (8)

  • Figure 1: DexDrummer: A hierarchical dexterous drumming framework trained in simulation and deployed in the real world. Left: Training in simulation. Our policy decomposes the task into a hierarchical controller. A high-level policy generates parameterized motion primitives that produce drumstick trajectories from musical inputs. A low-level policy then uses residual RL to learn corrective arm and hand actions for accurate trajectory tracking during fast transitions. Middle: Contact-targeted rewards. To handle contact-rich dynamics, we design rewards that target two types of interactions. In-hand contacts encourage stable finger–stick manipulation through fingertip contact rewards, a fulcrum reward between the thumb and index finger, and an arm energy penalty that promotes finger-dominant control. External contacts address stick–drum interactions with a trajectory guidance reward to encourage drum strikes and a contact curriculum that gradually introduces drum contact during training. Right: Real-world drumming. Policies trained in simulation transfer zero-shot to the real robot, enabling dexterous multi-drum playing over long horizons.
  • Figure 2: Drumming Environments. Our first simulation environment involves bimanual, multi-drum song-playing. Our second environment involves unimanual, single-drum control for a high-speed technical exercise. Finally, in the real world, we play songs with a drum pad and cymbal.
  • Figure 3: Bimanual Song-Playing Rollout. We visualize 6 frames across a single song trajectory, with lighter colored drums and cymbals corresponding to a hit. Every song requires multiple combinations of drums to be hit.
  • Figure 4: Results for Dexterous Song-Playing. Left: Reactive grasp outperforms fixed grasp by a large margin in long-horizon contacts. Right: For more challenging songs requiring frequent drum-to-drum transitions, reactive grasp still improves performance, but with a smaller margin, primarily due to the reduced action space of fixed grasp.
  • Figure 5: Results for Finger-Driven Control. Left: As tempo (beats per minute) increases, trajectory error for finger-driven control decreases and the gap to arm-driven control widens, showing the superior dexterity of finger-driven control. Energy consumption is also substantially lower compared to arm-driven control. Right: Without contact-targeted rewards, finger-driven control struggles to manage contact interactions effectively.
  • ...and 3 more figures
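
The Figure 1 caption describes two mechanisms that a short sketch may make more concrete: a low-level policy that adds residual corrections to a planned trajectory, and a contact curriculum that gradually introduces drum contact during training. The code below illustrates both under stated assumptions. The names `compose_action` and `contact_weight`, the residual scale, the action limits, and the linear ramp schedule are all hypothetical; the caption does not specify how the residual is scaled or what shape the curriculum takes.

```python
def clamp(x, lo, hi):
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def compose_action(planned, residual, residual_scale=0.1, limit=1.0):
    """Add a bounded learned correction to a planned action.

    planned: per-joint targets from the trajectory planner (normalized units).
    residual: raw policy output, clamped to [-1, 1] before scaling.
    residual_scale, limit: illustrative values, not taken from the paper.
    """
    return [
        clamp(p + residual_scale * clamp(r, -1.0, 1.0), -limit, limit)
        for p, r in zip(planned, residual)
    ]


def contact_weight(step, start_step=1000, ramp_steps=4000):
    """Linear curriculum weight in [0, 1] for drum-contact terms.

    Returns 0 before start_step, then ramps linearly to 1 over ramp_steps.
    The linear shape and both step counts are assumptions.
    """
    return clamp((step - start_step) / ramp_steps, 0.0, 1.0)


action = compose_action([0.5, 0.95], [1.0, 2.0])
weights = [contact_weight(s) for s in (0, 3000, 10000)]
```

Keeping the residual small relative to the planned action is one common way to let the planner supply the coarse motion while the learned policy handles fine corrections; whether DexDrummer bounds its residual this way is not stated in the caption.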