COSMOS: Coherent Supergaussian Modeling with Spatial Priors for Sparse-View 3D Splatting

Chaeyoung Jeong, Kwangsu Kim

TL;DR

Inspired by the concept of superpoints from 3D segmentation, COSMOS introduces 3D structure priors by newly defining supergaussian groupings of Gaussians based on local geometric cues and appearance features, enabling the integration of global and local spatial information.

Abstract

3D Gaussian Splatting (3DGS) has recently emerged as a promising approach for 3D reconstruction, providing explicit, point-based representations and enabling high-quality real-time rendering. However, when trained with sparse input views, 3DGS suffers from overfitting and structural degradation, leading to poor generalization on novel views. This limitation arises from its optimization relying solely on photometric loss without incorporating any 3D structure priors. To address this issue, we propose Coherent Supergaussian Modeling with Spatial Priors (COSMOS). Inspired by the concept of superpoints from 3D segmentation, COSMOS introduces 3D structure priors by newly defining supergaussian groupings of Gaussians based on local geometric cues and appearance features. To this end, COSMOS applies inter-group global self-attention across supergaussian groups and sparse local attention among individual Gaussians, enabling the integration of global and local spatial information. These structure-aware features are then used for predicting Gaussian attributes, facilitating more consistent 3D reconstructions. Furthermore, by leveraging supergaussian-based grouping, COSMOS enforces an intra-group positional regularization to maintain structural coherence and suppress floaters, thereby enhancing training stability under sparse-view conditions. Our experiments on Blender and DTU show that COSMOS surpasses state-of-the-art methods in sparse-view settings without any external depth supervision.
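
The intra-group positional regularization mentioned in the abstract can be illustrated with a minimal NumPy sketch. The exact loss used by COSMOS is not given here, so everything below is an assumption for illustration: the function name `intra_group_position_loss` is hypothetical, group membership is taken as a precomputed integer label per Gaussian, and the penalty is assumed to be the mean squared distance of each Gaussian center to its group centroid.

```python
import numpy as np


def intra_group_position_loss(positions, group_ids):
    """Mean squared distance of each Gaussian center to its group centroid.

    Hypothetical stand-in for an intra-group positional regularizer;
    not the formulation from the paper.
    """
    positions = np.asarray(positions, dtype=np.float64)  # (N, 3) centers
    group_ids = np.asarray(group_ids)                    # (N,) group labels

    # Map arbitrary group labels to contiguous indices 0..G-1.
    _, inverse = np.unique(group_ids, return_inverse=True)
    inverse = inverse.reshape(-1)
    num_groups = inverse.max() + 1

    # Sum positions per group, then divide by group size to get centroids.
    sums = np.zeros((num_groups, positions.shape[1]))
    np.add.at(sums, inverse, positions)
    counts = np.bincount(inverse, minlength=num_groups)[:, None]
    centroids = sums / counts

    # Penalize the deviation of each center from its own group's centroid.
    deviation = positions - centroids[inverse]
    return float(np.mean(np.sum(deviation ** 2, axis=1)))


if __name__ == "__main__":
    # Toy data (assumed): two Gaussians in group 0, one in group 1.
    centers = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [5.0, 5.0, 5.0]])
    groups = np.array([0, 0, 1])
    print(intra_group_position_loss(centers, groups))
```

Under this assumed form, a Gaussian that drifts away from the rest of its group (a floater) raises the penalty, which matches the stated purpose of keeping Gaussians within a supergaussian structurally coherent.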

Paper Structure

This paper contains 22 sections, 13 equations, 5 figures, and 4 tables.

Figures (5)

  • Figure 1: Comparison of the Optimization Process between 3DGS and COSMOS. While 3DGS learns each Gaussian independently to fit the training views, COSMOS leverages supergaussian grouping to inject spatial priors during optimization. Through inter‑group learning, it captures the global 3D structural context, while intra‑group regularization prevents Gaussians within similar geometric structures from diverging in different directions.
  • Figure 2: Architecture of COSMOS. To incorporate 3D priors, we initially group Gaussians into supergaussians based on geometric cues. A self-attention module operating at the supergaussian level is then employed to generate 3D feature vectors, enabling each Gaussian to be trained in a structurally informed manner rather than independently. Furthermore, we introduce a regularization term that constrains the positional deviation of Gaussians within the same supergaussian, which helps suppress floaters and alleviates structural collapse under sparse input conditions.
  • Figure 3: Qualitative Comparison on Blender. We train COSMOS and recent competitive models with 3 input views and render novel views for evaluation. COSMOS effectively suppresses floaters and restores fine high‑frequency details (see red boxes). The third‑row depth maps show that COSMOS reconstructs the smoothest and most continuous geometry without any depth supervision.
  • Figure 4: Qualitative Comparison on DTU. We train MVPGS, SplatFields, DNGaussian, and COSMOS using 3 input views and render novel views for evaluation. COSMOS demonstrates superior novel view synthesis performance, even on real-world datasets with complex textures and structures.
  • Figure 5: Qualitative Ablation Study of COSMOS. The red boxes highlight that the Full model captures the most detailed textures. In the depth map renderings, the object surfaces show that the Full model maintains the most consistent and continuous geometry.
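
The supergaussian-level self-attention described in the Figure 2 caption can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the COSMOS architecture: per-Gaussian features are assumed to be mean-pooled into one feature per supergaussian, attention is single-head scaled dot-product, the projection matrices `w_q`, `w_k`, `w_v` are random placeholders for learned weights, and the function name `group_self_attention` is hypothetical. The sparse local attention among individual Gaussians is not covered.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def group_self_attention(features, group_ids, w_q, w_k, w_v):
    """Global self-attention across group-level features.

    Returns the attended group context broadcast back to every Gaussian,
    plus the (G, G) attention weights between groups.
    """
    features = np.asarray(features, dtype=np.float64)  # (N, D)
    _, inverse = np.unique(np.asarray(group_ids), return_inverse=True)
    inverse = inverse.reshape(-1)
    num_groups = inverse.max() + 1

    # Mean-pool per-Gaussian features into one feature per group (assumed).
    pooled = np.zeros((num_groups, features.shape[1]))
    np.add.at(pooled, inverse, features)
    pooled /= np.bincount(inverse, minlength=num_groups)[:, None]

    # Single-head scaled dot-product attention over all groups.
    q, k, v = pooled @ w_q, pooled @ w_k, pooled @ w_v
    weights = softmax(q @ k.T / np.sqrt(k.shape[1]), axis=-1)
    group_context = weights @ v                         # (G, D_out)

    # Every Gaussian receives the context of the group it belongs to.
    return group_context[inverse], weights


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(6, 4))            # toy per-Gaussian features
    groups = np.array([0, 0, 1, 1, 2, 2])      # toy supergaussian labels
    w_q, w_k, w_v = (rng.normal(size=(4, 4)) for _ in range(3))
    context, attn = group_self_attention(feats, groups, w_q, w_k, w_v)
    print(context.shape, attn.shape)
```

Attending over groups rather than individual Gaussians keeps the attention matrix at G x G instead of N x N, which is one plausible reason to apply global attention at the supergaussian level.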