LMM-PCQA: Assisting Point Cloud Quality Assessment with LMM

Zicheng Zhang, Haoning Wu, Yingjie Zhou, Chunyi Li, Wei Sun, Chaofeng Chen, Xiongkuo Min, Xiaohong Liu, Weisi Lin, Guangtao Zhai

TL;DR

This work introduces LMM-PCQA, a text-supervised approach that brings large multi-modality models (LMMs) to point cloud quality assessment (PCQA) by treating point clouds as sequences of 2D projections and mapping mean opinion score (MOS) labels to textual adjectives. The method takes LMM-derived quality logits from six projections, combines them with multi-scale geometric features (linearity and planarity statistics computed at multiple k-NN scales), and regresses the combined features to a quality score with support vector regression (SVR). Empirical results on SJTU-PCQA, WPC, and WPC2.0 show competitive or superior performance against full-reference (FR) and no-reference (NR) PCQA baselines, with strong cross-database generalization and distortion-specific robustness. The study provides a practical framework for integrating LMMs into 3D visual quality analysis and offers a codebase for future research.
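The multi-scale geometric features mentioned above can be illustrated with a minimal sketch. It assumes the common eigenvalue-based definitions, linearity = (λ1 − λ2)/λ1 and planarity = (λ2 − λ3)/λ1, where λ1 ≥ λ2 ≥ λ3 are the eigenvalues of each point's local neighborhood covariance; the summary does not state the exact formulas. The function names, the neighborhood sizes (8 and 16), and the choice of mean and standard deviation as the pooled statistics are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def knn_indices(points, k):
    """Brute-force k nearest neighbours per point (the point itself is included)."""
    diff = points[:, None, :] - points[None, :, :]
    dist2 = np.einsum("ijk,ijk->ij", diff, diff)
    return np.argsort(dist2, axis=1)[:, :k]

def linearity_planarity(points, k):
    """Per-point linearity and planarity from the k-NN covariance eigenvalues."""
    neigh = points[knn_indices(points, k)]                    # (N, k, 3)
    centered = neigh - neigh.mean(axis=1, keepdims=True)
    cov = np.einsum("nki,nkj->nij", centered, centered) / k   # (N, 3, 3)
    eig = np.linalg.eigvalsh(cov)[:, ::-1]                    # descending order
    eig = np.clip(eig, 0.0, None)                             # drop tiny negatives
    lam1 = np.maximum(eig[:, 0], 1e-12)                       # avoid division by zero
    linearity = (eig[:, 0] - eig[:, 1]) / lam1
    planarity = (eig[:, 1] - eig[:, 2]) / lam1
    return linearity, planarity

def structural_features(points, scales=(8, 16)):
    """Pool mean/std of both descriptors at each (assumed) k-NN scale."""
    feats = []
    for k in scales:
        lin, pla = linearity_planarity(points, k)
        feats += [lin.mean(), lin.std(), pla.mean(), pla.std()]
    return np.array(feats)
```

A feature vector like this would then be concatenated with the LMM quality logits and passed to a regressor such as SVR. The brute-force neighbor search is only suitable for small clouds; a spatial index would be needed at realistic point counts.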

Abstract

Although large multi-modality models (LMMs) have seen extensive exploration and application in various quality assessment studies, their integration into Point Cloud Quality Assessment (PCQA) remains unexplored. Given LMMs' exceptional performance and robustness in low-level vision and quality assessment tasks, this study aims to investigate the feasibility of imparting PCQA knowledge to LMMs through text supervision. To achieve this, we transform quality labels into textual descriptions during the fine-tuning phase, enabling LMMs to derive quality rating logits from 2D projections of point clouds. To compensate for the loss of perception in the 3D domain, structural features are extracted as well. These quality logits and structural features are then combined and regressed into quality scores. Our experimental results affirm the effectiveness of our approach, showcasing a novel integration of LMMs into PCQA that enhances model understanding and assessment accuracy. We hope our contributions can inspire subsequent investigations into the fusion of LMMs with PCQA, fostering advancements in 3D visual quality analysis and beyond. The code is available at https://github.com/zzc-1998/LMM-PCQA.
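The label-to-text step described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the five adjectives, the equal-width binning of the MOS range, the prompt wording, and all function names are hypothetical. The second half shows one way the logits an LMM assigns to the adjective tokens could be turned into probabilities with a softmax.

```python
import math

# Assumed adjective set, ordered from worst to best (not confirmed by the source).
LEVELS = ["bad", "poor", "fair", "good", "excellent"]

def mos_to_adjective(mos, mos_min, mos_max):
    """Map a MOS value to one of the equal-width quality levels."""
    span = (mos_max - mos_min) / len(LEVELS)
    index = int((mos - mos_min) / span)
    index = max(0, min(index, len(LEVELS) - 1))   # clamp the end points
    return LEVELS[index]

def build_qa_pair(mos, mos_min, mos_max):
    """Build a hypothetical question-answer pair for text-supervised tuning."""
    adjective = mos_to_adjective(mos, mos_min, mos_max)
    return {
        "question": "How would you rate the quality of this point cloud?",
        "answer": f"The quality of the point cloud is {adjective}.",
    }

def adjective_probabilities(logits):
    """Softmax over the logits assigned to the adjective tokens."""
    peak = max(logits.values())                   # subtract max for stability
    exps = {word: math.exp(value - peak) for word, value in logits.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}
```

For example, `build_qa_pair(9.1, 0.0, 10.0)` yields an answer ending in "excellent." under the binning assumed here. At inference time the per-projection logits or probabilities, rather than a single decoded word, would serve as the quality features that are combined with the structural features.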

Paper Structure

This paper contains 30 sections, 11 equations, 3 figures, and 6 tables.

Figures (3)

  • Figure 1: The framework of the proposed method.
  • Figure 2: Illustration of the LMM evaluation pipeline. Point clouds with MOSs are transformed into question-answer pairs for LMM tuning. The LMM evaluation result is obtained as the set of probabilities assigned to the predefined qualitative adjectives.
  • Figure 3: SRCC/PLCC performance tendency according to the number of used projections on the SJTU-PCQA, WPC, and WPC2.0 databases.