MIGS: Multi-Identity Gaussian Splatting via Tensor Decomposition

Aggelina Chatziagapi, Grigorios G. Chrysos, Dimitris Samaras

TL;DR

MIGS tackles monocular multi‑identity human modeling by extending 3D Gaussian Splatting with a high‑order tensor that aggregates all per‑Gaussian parameters across identities. By applying CP tensor decomposition to the tensor of parameters, MIGS achieves a compact representation with significantly fewer learned parameters than training separate 3DGS models per identity, while enabling robust animation under novel poses. The approach jointly optimizes identity‑shared factors and identity‑specific deformation/color networks, with optional personalization and the ability to incorporate unseen identities via updating a single row of the factor matrices. Empirical results on ZJU‑MoCap and AIST++ show that MIGS outperforms recent NeRF and 3DGS baselines in unseen pose and view scenarios, demonstrating strong generalization and scalability for animatable digital humans.

Abstract

We introduce MIGS (Multi-Identity Gaussian Splatting), a novel method that learns a single neural representation for multiple identities, using only monocular videos. Recent 3D Gaussian Splatting (3DGS) approaches for human avatars require per-identity optimization. However, learning a multi-identity representation presents advantages in robustly animating humans under arbitrary poses. We propose to construct a high-order tensor that combines all the learnable 3DGS parameters for all the training identities. By assuming a low-rank structure and factorizing the tensor, we model the complex rigid and non-rigid deformations of multiple subjects in a unified network, significantly reducing the total number of parameters. Our proposed approach leverages information from all the training identities and enables robust animation under challenging unseen poses, outperforming existing approaches. It can also be extended to learn unseen identities.
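
The parameter saving from the low-rank assumption can be illustrated with a rough count. The sketch below compares the number of entries in a dense tensor of shape $N_i \times N_g \times M$ with the number of entries in three CP factor matrices of rank $R$. The sizes are hypothetical and chosen only for illustration, and the count ignores any additional networks the method trains, so it should be read as a back-of-the-envelope estimate rather than the paper's reported figures.

```python
def full_tensor_params(n_identities: int, n_gaussians: int, n_params: int) -> int:
    """Number of entries in a dense tensor of shape (N_i, N_g, M)."""
    return n_identities * n_gaussians * n_params


def cp_factor_params(n_identities: int, n_gaussians: int, n_params: int, rank: int) -> int:
    """Number of entries in three CP factor matrices:
    (M x R) + (N_i x R) + (N_g x R)."""
    return rank * (n_params + n_identities + n_gaussians)


# Hypothetical sizes for illustration only; they are not taken from the paper.
N_I, N_G, M, R = 10, 50_000, 59, 100

dense = full_tensor_params(N_I, N_G, M)
factored = cp_factor_params(N_I, N_G, M, R)

print(f"dense tensor: {dense:,} parameters")
print(f"CP factors:   {factored:,} parameters")
print(f"ratio:        {dense / factored:.1f}x")
```

Because the factor count grows with the sum of the mode sizes rather than their product, adding identities increases the factored representation by only $R$ values per identity under these assumptions.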

Paper Structure (23 sections, 9 equations, 14 figures, 4 tables)

Figures (14)

  • Figure 1: We introduce MIGS (multi-identity Gaussian splatting) that learns a single neural representation for multiple identities based on tensor decomposition. MIGS enables robust animation of human avatars under novel poses, out of the training distribution.
  • Figure 2: Overview of MIGS. Given monocular videos of multiple identities, we learn a unified 3DGS representation for human avatars based on CP tensor decomposition. We construct a tensor $\bm{\mathcal{W}} \in \mathbb{R}^{N_i \times N_g \times M}$, where $N_i$ is the number of identities, $N_g$ the number of 3D Gaussians, and $M$ the number of parameters per Gaussian. In practice, we assume a low-rank structure of the tensor $\bm{\mathcal{W}}$ and thus learn only the matrices $\bm{U}_1 \in \mathbb{R}^{M \times R}$, $\bm{U}_2 \in \mathbb{R}^{N_i \times R}$, $\bm{U}_3 \in \mathbb{R}^{N_g \times R}$ that approximate $\bm{\mathcal{W}}$ through the CP decomposition with $R \ll N_g$. By leveraging information from the diverse deformations of different subjects, MIGS enables robust animation under novel challenging poses.
  • Figure 3: Animation of human avatars under novel out-of-distribution poses. Qualitative comparison with state-of-the-art approaches, namely HumanNeRF [weng2022humannerf], MonoHuman [yu2023monohuman], GauHuman [hu2023gauhuman], and 3DGS-Avatar [qian20233dgs]. The training subjects are from the ZJU-MoCap dataset [peng2021neural] and the target poses (column 1) are from the AIST++ dataset [li2021learnaist-dance-db]. Since the target poses are outside the training distribution (the first three come from the advanced dance videos of AIST++), animation under these poses is challenging. Our method demonstrates significant robustness.
  • Figure 4: Animation of human avatars under novel poses. Our method robustly animates all the identities under challenging unseen poses, from unseen camera views and advanced dance videos, outperforming the other methods: HumanNeRF [weng2022humannerf], MonoHuman [yu2023monohuman], and 3DGS-Avatar [qian20233dgs]. The subjects are from AIST++ [li2021learnaist-dance-db].
  • Figure 5: Learning a novel identity. Our method can be extended to learn a novel identity, and then animate it under novel poses and novel views.
  • ...and 9 more figures
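
The CP structure described in the Figure 2 caption can be sketched in a few lines. Using the caption's shapes, an entry of the tensor is approximated as $\mathcal{W}_{i,g,m} \approx \sum_{r=1}^{R} (\bm{U}_2)_{i,r} (\bm{U}_3)_{g,r} (\bm{U}_1)_{m,r}$. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the factor matrices are filled with random values instead of being learned, the sizes are toy values, and the helper names (`random_matrix`, `cp_entry`, `gaussian_params`) are hypothetical. The final step assumes that a new identity corresponds to one additional row in the identity factor $\bm{U}_2$, with the other factors left untouched.

```python
import random
from typing import List

Matrix = List[List[float]]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    """Stand-in for a learned factor matrix (random values, illustration only)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def cp_entry(u1: Matrix, u2: Matrix, u3: Matrix, i: int, g: int, m: int) -> float:
    """CP reconstruction of one entry: sum_r U2[i][r] * U3[g][r] * U1[m][r]."""
    return sum(a * b * c for a, b, c in zip(u2[i], u3[g], u1[m]))


def gaussian_params(u1: Matrix, u2: Matrix, u3: Matrix, i: int, g: int) -> List[float]:
    """All M reconstructed parameters of Gaussian g for identity i."""
    return [cp_entry(u1, u2, u3, i, g, m) for m in range(len(u1))]


rng = random.Random(0)
N_I, N_G, M, R = 3, 8, 5, 2  # toy sizes, not taken from the paper

U1 = random_matrix(M, R, rng)    # parameter mode,  shape (M, R)
U2 = random_matrix(N_I, R, rng)  # identity mode,   shape (N_i, R)
U3 = random_matrix(N_G, R, rng)  # Gaussian mode,   shape (N_g, R)

# Parameters of Gaussian 4 for identity 0, rebuilt from the factors on demand.
before = gaussian_params(U1, U2, U3, i=0, g=4)

# Assumed extension to a new identity: append one row to U2 only.
U2.append([rng.uniform(-1.0, 1.0) for _ in range(R)])
new_identity = gaussian_params(U1, U2, U3, i=N_I, g=4)

# Existing identities are unaffected because their rows of U2 did not change.
after = gaussian_params(U1, U2, U3, i=0, g=4)
print(before == after, len(new_identity))
```

Note that the dense tensor is never materialized here; each Gaussian's parameters are reconstructed from the three factors when needed, which is what keeps the stored representation small.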