Generating Realistic Synthetic Head Rotation Data for Extended Reality using Deep Learning

Jakob Struye, Filip Lemic, Jeroen Famaey

TL;DR

The paper tackles the problem of scarce, realistic head-rotation data for XR by introducing a TimeGAN-based time-series generator that learns temporal dependencies from existing datasets. It presents a complete methodology, from data preparation to model tuning, and introduces interpretable, domain-specific metrics to evaluate realism beyond generic statistical measures. Empirical results across multiple datasets show that TimeGAN-generated head-rotation data closely matches real data in orientation, motion, and cross-axis correlations, outperforming Fourier-based baselines. The approach enables scalable data augmentation for XR applications like proactive viewport encoding and millimeter-wave beamforming, with plans for open-source release and future work to address very-low-motion underrepresentation.

Abstract

Extended Reality is a revolutionary method of delivering multimedia content to users. A large contributor to its popularity is the sense of immersion and interactivity enabled by having real-world motion reflected in the virtual experience accurately and immediately. This user motion, mainly caused by head rotations, induces several technical challenges. For instance, which content is generated and transmitted depends heavily on where the user is looking. Seamless systems, taking user motion into account proactively, will therefore require accurate predictions of upcoming rotations. Training and evaluating such predictors requires vast amounts of orientational input data, which is expensive to gather, as it requires human test subjects. A more feasible approach is to gather a modest dataset through test subjects, and then extend it to a more sizeable set using synthetic data generation methods. In this work, we present a head rotation time series generator based on TimeGAN, an extension of the well-known Generative Adversarial Network, designed specifically for generating time series. This approach is able to extend a dataset of head rotations with new samples closely matching the distribution of the measured time series.

Paper Structure (21 sections, 10 figures, 1 table)

Figures (10)

  • Figure 1: Illustration of the TimeGAN training process, displaying neural networks (ellipses) and data (rectangles). Rectangles with rounded corners indicate data generated by the model, while those with sharp corners represent data taken from input sequences. Each line is part of training the latent space (orange), unsupervised adversarial training (green) or supervised training (blue). Solid lines are data inputs, dashed lines are data outputs and dotted lines are loss signals. Losses originating from two data sets are a dissimilarity measure (e.g., mean squared error) while the loss originating from the discriminator is a classification loss (e.g., cross-entropy).
  • Figure 2: Distribution (quantized) of yaw, pitch and roll within the original dataset. Pitch is, by definition, restricted to $[-90,90]$ degrees.
  • Figure 3: CDF of the absolute error between the original dataset and the same dataset after downsampling and then upsampling with a cubic spline interpolator. Errors over 0.5° are omitted for clarity.
  • Figure 4: One sample extracted from the original dataset, after downsampling.
  • Figure 5: Distribution of yaw, pitch and roll values across all time steps of all samples.
  • ...and 5 more figures
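
The round-trip check described in the Figure 3 caption (downsample, upsample again with a cubic spline, then inspect the absolute error) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the sinusoidal yaw trace, the 100 Hz sample rate, the downsampling factor of 4 and the helper names `natural_cubic_spline` and `empirical_cdf` are all hypothetical, and a natural cubic spline is used because the caption does not state which boundary conditions were applied.

```python
import math

def natural_cubic_spline(ys, h):
    """Return a function evaluating the natural cubic spline through (i*h, ys[i])."""
    n = len(ys) - 1
    m = [0.0] * (n + 1)  # second derivatives; natural boundary: m[0] = m[n] = 0
    if n >= 2:
        # Solve m[i-1] + 4*m[i] + m[i+1] = rhs for i = 1..n-1 (Thomas algorithm).
        rhs = [6.0 * (ys[i - 1] - 2 * ys[i] + ys[i + 1]) / h**2 for i in range(1, n)]
        c_prime, d_prime = [], []
        for k, r in enumerate(rhs):
            denom = 4.0 if k == 0 else 4.0 - c_prime[k - 1]
            prev_d = 0.0 if k == 0 else d_prime[k - 1]
            c_prime.append(1.0 / denom)
            d_prime.append((r - prev_d) / denom)
        for k in range(len(rhs) - 1, -1, -1):
            m[k + 1] = d_prime[k] - c_prime[k] * m[k + 2]

    def evaluate(x):
        i = min(int(x // h), n - 1)          # index of the spline segment
        a, b = (i + 1) * h - x, x - i * h    # distances to right and left knots
        return ((m[i] * a**3 + m[i + 1] * b**3) / (6 * h)
                + (ys[i] / h - m[i] * h / 6) * a
                + (ys[i + 1] / h - m[i + 1] * h / 6) * b)

    return evaluate

def empirical_cdf(values, threshold):
    """Fraction of values that are <= threshold."""
    return sum(v <= threshold for v in values) / len(values)

# Hypothetical yaw trace in degrees: 4 s at an assumed 100 Hz sample rate.
SAMPLE_RATE_HZ = 100
FACTOR = 4  # assumed downsampling factor
original = [30.0 * math.sin(2 * math.pi * 0.5 * k / SAMPLE_RATE_HZ) for k in range(401)]

downsampled = original[::FACTOR]                      # keep every 4th sample
spline = natural_cubic_spline(downsampled, FACTOR)    # knots at original indices 0, 4, 8, ...
reconstructed = [spline(k) for k in range(len(original))]

# Per-sample absolute error, the quantity whose CDF the caption describes.
errors = [abs(o - r) for o, r in zip(original, reconstructed)]
for threshold in (0.001, 0.01, 0.1, 0.5):
    print(f"P(error <= {threshold} deg) = {empirical_cdf(errors, threshold):.3f}")
```

On measured head-rotation traces the error distribution will depend on how much high-frequency motion the downsampling discards, so the smooth trace used here only demonstrates the procedure, not the reported result.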