
Synchronize Dual Hands for Physics-Based Dexterous Guitar Playing

Pei Xu, Ruocheng Wang

TL;DR

The virtual guitarist trained by the proposed approach can synthesize motions from unstructured reference data of general guitar-playing practice motions, and accurately play diverse rhythms with complex chord pressing and string picking patterns based on the input guitar tabs that do not exist in the references.

Abstract

We present a novel approach to synthesize dexterous motions for physically simulated hands in tasks that require coordination between the control of two hands with high temporal precision. Instead of directly learning a joint policy to control two hands, our approach performs bimanual control through cooperative learning where each hand is treated as an individual agent. The individual policies for each hand are first trained separately, and then synchronized through latent space manipulation in a centralized environment to serve as a joint policy for two-hand control. By doing so, we avoid directly performing policy learning in the joint state-action space of two hands with higher dimensions, greatly improving the overall training efficiency. We demonstrate the effectiveness of our proposed approach in the challenging guitar-playing task. The virtual guitarist trained by our approach can synthesize motions from unstructured reference data of general guitar-playing practice motions, and accurately play diverse rhythms with complex chord pressing and string picking patterns based on the input guitar tabs that do not exist in the references. Along with this paper, we provide the motion capture data that we collected as the reference for policy training. Code is available at: https://pei-xu.github.io/guitar.

Paper Structure (21 sections, 16 equations, 16 figures, 1 table)


Figures (16)

  • Figure 1: Overview of the proposed system, which synchronizes two policies for dexterous two-handed guitar playing. Our system trains the two-hand policy in two steps. First, we decentralize the control of the two hands and independently train the left-hand policy for fret pressing (orange box) and the right-hand policy for string picking (blue box). Then, we lock the previously trained single-hand policies and introduce a synchronizer that coordinates their behaviors in a centralized training environment to obtain a joint policy for two-hand control. The synchronization is achieved quickly by modifying the single-hand policies' behavior patterns through latent space manipulation.
  • Figure 2: Profile of the simulated guitar and hand models. Our guitar model is left-handed. It has six strings with twenty-four frets and is played with a pick. Given that a common guitar chord spans at most four frets and that each string has two additional conditions, open (picked without fret pressing) and muted (not picked, so it produces no sound), there are theoretically nearly 900,000 chord combinations, which makes the full chord space difficult to master. Each of our hand models has 16 links with 27 degrees of freedom (DoFs): the wrist joint has 6 DoFs, each MCP joint has 2 DoFs except the thumb MCP joint, which has 3, and every other finger joint has 1 DoF. The simulated guitar has a scale length of 24.5 inches (around 0.62 m) in our implementation. In this work, we only consider the control of the hands and assume that the guitar is fixed in 3D space. We can obtain right-handed guitar-playing motions by mirroring the setup of the hands and guitar.
  • Figure 3: Demonstrations of finger availability for fret pressing. We allow multiple fingers to press the same fret but do not allow fingers to cross over. Each finger can press no more than one fret, and at most four frets can be pressed simultaneously. Here we show the four possible cases, with one (right) to four (left) target frets that need to be pressed. Each target fret and the fingers available to press it are labeled with the same number.
  • Figure 4: Distribution of F1 scores achieved by the left-hand policy when playing chords. The test set contains 50 music tracks with 1721 non-repeated measures containing 4859 chords.
  • Figure 5: F1 scores of bimanual performance on a test set of 25 music tracks before and after synchronization. We show both the mean values and the standard deviations of the F1 scores, evaluated at the level of individual music notes.
  • ...and 11 more figures
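
The finger-availability rule in the Figure 3 caption can be sketched in a few lines of Python. This is an illustrative reading of the caption, not the authors' implementation. It assumes that four fingers press frets (the thumb is excluded), that fingers and target frets are both indexed in the same order along the neck, and that every target fret needs at least one finger. The names `available_fingers` and `valid_assignments` are hypothetical.

```python
from itertools import product

NUM_FINGERS = 4  # assumption: index, middle, ring, pinky; the thumb does not press frets


def available_fingers(num_targets, num_fingers=NUM_FINGERS):
    """Return, for each target fret, the finger indices that may press it.

    Target i needs i distinct lower-indexed fingers for the earlier targets
    and (num_targets - 1 - i) higher-indexed fingers for the later ones,
    so finger j is available for target i iff i <= j <= i + slack.
    """
    if not 1 <= num_targets <= num_fingers:
        raise ValueError("need between 1 and num_fingers target frets")
    slack = num_fingers - num_targets
    return [list(range(i, i + slack + 1)) for i in range(num_targets)]


def valid_assignments(num_targets, num_fingers=NUM_FINGERS):
    """Brute-force all finger -> target maps (None means the finger is unused)."""
    options = [None, *range(num_targets)]
    result = []
    for assign in product(options, repeat=num_fingers):
        used = [t for t in assign if t is not None]
        if set(used) != set(range(num_targets)):
            continue  # every target fret must be pressed by some finger
        if used != sorted(used):
            continue  # fingers must not cross over
        result.append(assign)
    return result


if __name__ == "__main__":
    for k in range(1, NUM_FINGERS + 1):
        print(k, "target fret(s):", available_fingers(k))
```

Under these assumptions, one target fret can be pressed by any of the four fingers, while four target frets leave exactly one finger per fret, which matches the two extreme cases the caption describes. The brute-force enumeration is included only as a cross-check on the closed-form rule.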
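
Figures 4 and 5 report F1 scores, so a minimal sketch of a note-level F1 computation may help make the metric concrete. The paper's exact matching criterion is not given in this section. The sketch below assumes each note is encoded as a hashable tuple such as `(time_step, string, fret)` and counts a played note as correct only on an exact match. The function name `note_f1` and the encoding are hypothetical.

```python
def note_f1(target_notes, played_notes):
    """F1 score between the notes in a tab and the notes actually sounded.

    Both arguments are iterables of hashable note descriptors, for example
    (time_step, string, fret). Exact matching is an assumption of this sketch.
    """
    target, played = set(target_notes), set(played_notes)
    true_pos = len(target & played)   # notes that were required and played
    false_pos = len(played - target)  # notes played but not in the tab
    false_neg = len(target - played)  # notes in the tab but not played
    denom = 2 * true_pos + false_pos + false_neg
    if denom == 0:
        return 1.0  # convention chosen here: nothing to play, nothing played
    return 2 * true_pos / denom


if __name__ == "__main__":
    tab = [(0, 1, 3), (0, 2, 2), (1, 1, 0)]
    performance = [(0, 1, 3), (0, 2, 2), (1, 3, 0)]  # last note on the wrong string
    print(round(note_f1(tab, performance), 3))
```

Because F1 combines precision and recall, it penalizes both missed notes and extra notes, which fits a task where sounding the wrong string is as much an error as skipping the right one.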