Investigating and Mitigating the Side Effects of Noisy Views for Self-Supervised Clustering Algorithms in Practical Multi-View Scenarios

Jie Xu; Yazhou Ren; Xiaolong Wang; Lei Feng; Zheng Zhang; Gang Niu; Xiaofeng Zhu

TL;DR

A novel MVC objective is proposed that allows un-shared parameters and inconsistent clustering predictions across views, which reduces the side effects of noisy views. A two-level multi-view iterative optimization is then designed to generate robust learning targets that refine each view's representation learning.

Abstract

Multi-view clustering (MVC) aims to explore category structures in multi-view data in a self-supervised manner. Multiple views provide more information than a single view, so existing MVC methods can achieve satisfactory performance. However, their performance can degrade severely when some views are noisy, as happens in practical multi-view scenarios. In this paper, we formally investigate the drawback of noisy views and then propose a theoretically grounded deep MVC method, MVCAN, to address it. Specifically, we propose a novel MVC objective that enables un-shared parameters and inconsistent clustering predictions across multiple views to reduce the side effects of noisy views. Furthermore, a two-level multi-view iterative optimization is designed to generate robust learning targets for refining individual views' representation learning. Theoretical analysis reveals that MVCAN works by achieving multi-view consistency, complementarity, and noise robustness. Finally, experiments on extensive public datasets demonstrate that MVCAN outperforms state-of-the-art methods and is robust to the presence of noisy views.

Paper Structure (14 sections, 10 theorems, 35 equations, 10 figures, 11 tables, 1 algorithm)

Introduction
Background and Analysis
Preliminaries
Analysis of Noisy-View Drawback (NVD)
Methodology
Clustering Objective Against NVD
Two-Level Multi-View Iterative Optimization
Theoretical Analysis of Multi-View Consistency & Complementarity & Noise Robustness
Experiments
Settings
Comparison Results and Analysis
Ablation Study
Model Analysis
Conclusion

Key Result

Theorem 1

Let $\mathbf{\check{Y}} = \mathbf{L} \mathbf{A}$, where $\mathbf{A} \in \{0,1\}^{K\times K}$ makes $\mathbf{\check{Y}}$ maximally match the learning target $\mathbf{T}$. Then the clustering accuracy can be calculated as $ACC = \frac{1}{N} \left(N - \frac{1}{2}\| \mathbf{\check{Y}} - \mathbf{T}
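
The statement above is cut off in this copy, so the following is only a minimal sketch of the idea, under assumptions that are not confirmed by the text: $\mathbf{L}$ and $\mathbf{T}$ are taken to be one-hot $N \times K$ label matrices, $\mathbf{A}$ is taken to be a permutation matrix, and the norm is taken to be the squared Frobenius norm. Under those assumptions each misassigned sample contributes exactly 2 to the squared distance, so half the distance counts the errors. The function names are hypothetical, and the brute-force search over permutations is only practical for small $K$.

```python
from itertools import permutations


def one_hot(labels, k):
    """Turn integer labels into an N x K one-hot matrix (list of lists)."""
    return [[1 if j == c else 0 for j in range(k)] for c in labels]


def matmul(a, b):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def sq_frobenius_diff(a, b):
    """Squared Frobenius norm of (a - b). Assumed norm, not from the source."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_alignment(L, T, k):
    """Search all K x K permutation matrices A (an assumption about A)
    and return the one for which L @ A is closest to T."""
    best_A, best_dist = None, None
    for perm in permutations(range(k)):
        # Row i of A has its single 1 in column perm[i]:
        # predicted cluster i is renamed to target cluster perm[i].
        A = [[1 if perm[i] == j else 0 for j in range(k)] for i in range(k)]
        dist = sq_frobenius_diff(matmul(L, A), T)
        if best_dist is None or dist < best_dist:
            best_A, best_dist = A, dist
    return best_A, best_dist


def clustering_accuracy(pred, target, k):
    """ACC = (N - 0.5 * ||L A - T||_F^2) / N under the stated assumptions."""
    L, T = one_hot(pred, k), one_hot(target, k)
    _, dist = best_alignment(L, T, k)
    n = len(pred)
    return (n - dist / 2) / n


# The prediction is a pure relabeling of the target, so accuracy is perfect.
print(clustering_accuracy([1, 1, 0, 0, 2, 2], [0, 0, 1, 1, 2, 2], 3))  # 1.0
```

For larger $K$, the exhaustive search would normally be replaced by an assignment solver such as the Hungarian algorithm; the brute-force version is used here only to keep the sketch dependency-free.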

Figures (10)

Figure 1: Different training iterations of $\mathcal{T}$-level (a) and $\mathcal{R}$-level (b) in the proposed two-level multi-view iterative optimization.
Figure 1: Illustration of the noisy-view drawback (NVD). The informative views have distinct representation similarities, which can promote clustering due to their consistency and complementarity. The noisy views, however, have indistinct representation similarities. For instance, views extracted from faulty or inapplicable sensors bring noisy information and hinder clustering, which makes investigating noise robustness practically significant.
Figure 2: Loss vs. epoch on BDGP and NoisyBDGP.
Figure 2: The framework of MVCAN. (a) MVCAN uses the two-level multi-view iterative optimization strategy to train the model for clustering multi-view data $\{\mathbf{X}^v\}_{v=1}^V$. (b) The $\mathcal{R}$-level iteration adjusts $\mathbf{T}$ via $\mathbf{A}^v$ so that it corresponds to each $\mathbf{Y}^v$, and updates the decoupled model, which has un-shared network parameters for the $V$ views, to obtain the views' individual representations $\{\mathbf{Z}^v\}_{v=1}^V$ and clustering soft labels $\{\mathbf{Y}^v\}_{v=1}^V$. (c) The $\mathcal{T}$-level iteration infers the robust learning target $\mathbf{T}$ from the already learned $\{\mathbf{Z}^v\}_{v=1}^V$ and $\{\mathbf{Y}^v\}_{v=1}^V$ (including the iterations of the scaling matrix $\mathbf{W}_{(t)}$, the scaled representation $\mathbf{Z}_{(t)}$, and the robust soft labels $\mathbf{Y}_{(t)}$). Note: $\mathbf{W}_{(t)}$ scales the values of the different dimensions of $\mathbf{Z}_{(t)}$, hence the name scaling matrix. $\mathbf{Y}_{(t)}$ gives the final clustering predictions.
Figure 3: ACC and NMI vs. $\lambda$ on different datasets.
...and 5 more figures

Theorems & Definitions (19)

Theorem 1
Definition 1
Definition 2
Theorem 2
Theorem 3
Theorem 4
Theorem 5
Definition 1
Definition 2
Theorem 1
...and 9 more
