Kaolin: A PyTorch Library for Accelerating 3D Deep Learning Research

Krishna Murthy Jatavallabhula, Edward Smith, Jean-Francois Lafleche, Clement Fuji Tsang, Artem Rozantsev, Wenzheng Chen, Tommy Xiang, Rev Lebaredian, Sanja Fidler

TL;DR

Kaolin addresses the fragmentation in 3D deep learning by providing an integrated PyTorch-based library that covers data loading, multi-representation support, differentiable rendering, metrics, visualization, and a pretrained-model ecosystem. Its modular architecture includes a flexible DifferentiableRenderer, differentiable geometry operations, and broad dataset compatibility, enabling rapid prototyping across meshes, point clouds, voxels, SDFs, and RGB-D data. The paper highlights performance-oriented implementations and a rich model zoo to standardize benchmarks and accelerate research in 3D tasks such as reconstruction, segmentation, and reasoning under 2D supervision. The work aims to lower entry barriers, promote reproducibility, and foster community contributions through open-source development and extensible tooling.

Abstract

We present Kaolin, a PyTorch library aiming to accelerate 3D deep learning research. Kaolin provides efficient implementations of differentiable 3D modules for use in deep learning systems. With functionality to load and preprocess several popular 3D datasets, and native functions to manipulate meshes, pointclouds, signed distance functions, and voxel grids, Kaolin mitigates the need to write wasteful boilerplate code. Kaolin packages together several differentiable graphics modules including rendering, lighting, shading, and view warping. Kaolin also supports an array of loss functions and evaluation metrics for seamless evaluation and provides visualization functionality to render the 3D results. Importantly, we curate a comprehensive model zoo comprising many state-of-the-art 3D deep learning architectures, to serve as a starting point for future research endeavours. Kaolin is available as open-source software at https://github.com/NVIDIAGameWorks/kaolin/.
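The abstract mentions loss functions and evaluation metrics without naming them. As an illustration only, the sketch below computes a symmetric Chamfer distance between two pointclouds, a metric commonly used in 3D deep learning. It is written in plain Python rather than with Kaolin's API, and the names `squared_distance` and `chamfer_distance` are hypothetical. Several conventions for this metric exist (squared or unsquared distances, sums or means); this version assumes mean squared nearest-neighbour distances.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two 3D points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def chamfer_distance(source: Sequence[Point], target: Sequence[Point]) -> float:
    """Symmetric Chamfer distance (assumed convention: mean of squared
    nearest-neighbour distances, summed over both directions)."""
    if not source or not target:
        raise ValueError("both pointclouds must be non-empty")
    # For every source point, distance to its closest target point.
    src_to_tgt = sum(
        min(squared_distance(s, t) for t in target) for s in source
    ) / len(source)
    # For every target point, distance to its closest source point.
    tgt_to_src = sum(
        min(squared_distance(t, s) for s in source) for t in target
    ) / len(target)
    return src_to_tgt + tgt_to_src
```

This brute-force version costs O(N·M) distance evaluations, which is acceptable for small clouds; a batched GPU implementation would be the natural choice at training scale.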

Paper Structure

This paper contains 10 sections, 5 figures, 2 tables.

Figures (5)

  • Figure 1: Kaolin is a PyTorch library aiming to accelerate 3D deep learning research. Kaolin provides 1) functionality to load and preprocess popular 3D datasets, 2) a large model zoo of commonly used neural architectures and loss functions for 3D tasks on pointclouds, meshes, voxel grids, signed distance functions, and RGB-D images, 3) modular implementations of several existing differentiable renderers, with support for several shaders, 4) most of the common 3D metrics for easy evaluation of research results, and 5) functionality to visualize 3D results. Functions in Kaolin are highly optimized, with significant speed-ups over existing 3D DL research code.
  • Figure 2: Kaolin makes training 3D DL models simple. We provide an illustration of the code required to train and test a PointNet++ classifier that distinguishes cars from airplanes, in 5 lines of code.
  • Figure 3: Kaolin provides efficient PyTorch operations for converting across 3D representations. While meshes, pointclouds, and voxel grids continue to be the most popular 3D representations, Kaolin has extensive support for signed distance functions (SDFs), orthographic depth maps (ODMs), and RGB-D images.
  • Figure 4: Modular differentiable renderer: Kaolin hosts a flexible, modular differentiable renderer that allows individual sub-operations to be swapped easily to compose new variations.
  • Figure 5: Applications of Kaolin: (Clockwise from top-left) 3D object prediction with 2D supervision [dib], 3D content creation with generative adversarial networks [3DIWGAN], 3D segmentation [meshcnn], automatically tagging 3D assets from TurboSquid [turbosquid], 3D object prediction with 3D supervision [GEOMetrics], and more.
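
The representation conversions described in the Figure 3 caption can be illustrated with a minimal sketch of one such conversion, pointcloud to voxel grid. This is plain Python and not Kaolin's implementation; the names `pointcloud_to_voxelgrid` and `occupied_cells` are hypothetical, and the choice to normalize by the largest bounding-box extent is an assumption made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
VoxelGrid = List[List[List[int]]]


def pointcloud_to_voxelgrid(points: Sequence[Point], resolution: int) -> VoxelGrid:
    """Mark each cell of a resolution^3 occupancy grid that holds at least one point."""
    if resolution < 1:
        raise ValueError("resolution must be at least 1")
    grid = [[[0] * resolution for _ in range(resolution)] for _ in range(resolution)]
    if not points:
        return grid

    # Axis-aligned bounding box of the cloud.
    mins = [min(p[axis] for p in points) for axis in range(3)]
    maxs = [max(p[axis] for p in points) for axis in range(3)]
    # One uniform scale for all axes so the shape is not stretched;
    # fall back to 1.0 when all points coincide.
    extent = max(maxs[axis] - mins[axis] for axis in range(3)) or 1.0

    for p in points:
        index = []
        for axis in range(3):
            normalized = (p[axis] - mins[axis]) / extent  # lies in [0, 1]
            # Clamp so points on the upper boundary land in the last cell.
            index.append(min(int(normalized * resolution), resolution - 1))
        grid[index[0]][index[1]][index[2]] = 1
    return grid


def occupied_cells(grid: VoxelGrid) -> int:
    """Count the occupied cells of an occupancy grid."""
    return sum(cell for plane in grid for row in plane for cell in row)
```

For example, voxelizing the four points `(0, 0, 0)`, `(0.1, 0, 0)`, `(0.9, 0.9, 0.9)`, and `(1, 1, 1)` at resolution 2 fills two opposite corner cells. The conversion is lossy: points that share a cell collapse into a single occupied voxel.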