Table of Contents
Fetching ...

Decomposing God Header File via Multi-View Graph Clustering

Yue Wang, Wenhui Chang, Tongwei Deng, Yanzhen Zou, Bing Xie

TL;DR

A novel multi-view graph clustering based approach for decomposing God Header Files that first constructs and coarsens the code element graph, then a novel multi-view graph clustering algorithm is applied to identify the clusters and a heuristic algorithm is introduced to address the cyclic dependencies in the clustering results.

Abstract

God Header Files, just like God Classes, pose significant challenges for code comprehension and maintenance. Additionally, they increase the time required for code recompilation. However, existing refactoring methods for God Classes are inappropriate to deal with God Header Files because the code elements in header files are mostly short declaration types, and build dependencies of the entire system should be considered with the aim of improving compilation efficiency. Meanwhile, ensuring acyclic dependencies among the decomposed sub-header files is also crucial in the God Header File decomposition. This paper proposes a multi-view graph clustering based approach for decomposing God Header Files. It first constructs and coarsens the code element graph, then a novel multi-view graph clustering algorithm is applied to identify the clusters and a heuristic algorithm is introduced to address the cyclic dependencies in the clustering results. To evaluate our approach, we built both a synthetic dataset and a real-world God Header Files dataset. The results show that 1) Our approach could achieve 11.5% higher accuracy than existing God Class refactoring methods; 2) Our decomposition results attain better architecture on real-world God Header Files, evidenced by higher modularity and acyclic dependencies; 3) We can reduce 15% to 60% recompilation time for historical commits that require recompiling.

Decomposing God Header File via Multi-View Graph Clustering

TL;DR

A novel multi-view graph clustering based approach for decomposing God Header Files that first constructs and coarsens the code element graph, then a novel multi-view graph clustering algorithm is applied to identify the clusters and a heuristic algorithm is introduced to address the cyclic dependencies in the clustering results.

Abstract

God Header Files, just like God Classes, pose significant challenges for code comprehension and maintenance. Additionally, they increase the time required for code recompilation. However, existing refactoring methods for God Classes are inappropriate to deal with God Header Files because the code elements in header files are mostly short declaration types, and build dependencies of the entire system should be considered with the aim of improving compilation efficiency. Meanwhile, ensuring acyclic dependencies among the decomposed sub-header files is also crucial in the God Header File decomposition. This paper proposes a multi-view graph clustering based approach for decomposing God Header Files. It first constructs and coarsens the code element graph, then a novel multi-view graph clustering algorithm is applied to identify the clusters and a heuristic algorithm is introduced to address the cyclic dependencies in the clustering results. To evaluate our approach, we built both a synthetic dataset and a real-world God Header Files dataset. The results show that 1) Our approach could achieve 11.5% higher accuracy than existing God Class refactoring methods; 2) Our decomposition results attain better architecture on real-world God Header Files, evidenced by higher modularity and acyclic dependencies; 3) We can reduce 15% to 60% recompilation time for historical commits that require recompiling.
Paper Structure (25 sections, 7 equations, 5 figures, 3 tables, 1 algorithm)

This paper contains 25 sections, 7 equations, 5 figures, 3 tables, 1 algorithm.

Figures (5)

  • Figure 1: An example of God Header File (guc.h).
  • Figure 2: Joint distribution of header files' code size and file impact.
  • Figure 3: Overview of our God Header File Decomposition Approach.
  • Figure 4: Modularity of different methods on real-world God Header Files. Black cross markers represent the decomposition results with cyclic dependencies.
  • Figure 5: The process of cyclic dependency fixing of settings.h(#subfiles=7). Each circle represents a subfile with a label $i:(|C_i|)$. Each arrow represents an include relationship. And the subfiles involved in cycles are marked with color red.