Effective Individual Fairest Community Search over Heterogeneous Information Networks

Taige Zhao, Jianxin Li, Ningning Cui, Wei Luo

TL;DR

This paper defines and tackles the problem of Individual Fairest Community Search over Heterogeneous Information Networks (IFCS on HINs), which aims to identify a maximal set of vertices of a target type that are tightly connected via a target-aware motif while differing as little as possible in activity, as measured by a Gini-like fairness score. The authors model higher-order, target-aware relationships through motifs and introduce the $M$-graph to capture candidate connections among community members; fairness is quantified by the proposed FS metric, which depends on the active levels of community members. They prove the problem is NP-hard and propose a Filter-Verify framework with three optimizations: an exploration-based filter to prune candidate targets, a message-passing strategy to avoid redundant motif enumerations, and a lower-bound pruning technique to discard unfair candidates early. Experiments on four real-world HINs (IMDB, DBLP, Freebase, Amazon) show that the optimized approach substantially outperforms baselines in efficiency while producing the fairest cohesive target-aware communities, and that it compares favorably with traditional cohesion models in terms of r-degree, PathSim, density, and M-distance. Overall, the work advances fair, semantically rich community search in heterogeneous networks and provides scalable methods for practical deployment.
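To make the "Gini-like" idea concrete, here is a minimal sketch that scores how unevenly activity is spread across the members of a candidate community. It uses the standard Gini coefficient as a stand-in; the paper's FS metric is not reproduced on this page and may differ in its exact form. The function name `gini` and the sample activity levels are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def gini(levels: Sequence[float]) -> float:
    """Standard Gini coefficient of non-negative activity levels.

    Assumption: this is a stand-in for the paper's FS metric, not its definition.
    0.0 means every member is equally active; larger values mean more disparity.
    """
    n = len(levels)
    total = sum(levels)
    if n == 0 or total == 0:
        return 0.0
    # Sum of absolute differences over all ordered pairs of members,
    # normalised by 2 * n * total (equivalently 2 * n^2 * mean).
    diff_sum = sum(abs(a - b) for a in levels for b in levels)
    return diff_sum / (2 * n * total)


# Hypothetical activity levels for two candidate communities of four members.
balanced = gini([5, 5, 5, 5])
skewed = gini([1, 1, 1, 17])
print(balanced, skewed)
```

Under this stand-in, a search for the fairest community would prefer the candidate with the lower score, so the evenly active group ranks ahead of the one dominated by a single member.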

Abstract

Community search over heterogeneous information networks (HINs) has been applied in a wide range of domains, such as activity organization and team formation. In these scenarios, members of a group who receive the same treatment often have different levels of activity and workload, which causes unfair treatment between active and inactive members (called individual unfairness). However, existing works pay no attention to individual fairness and do not sufficiently consider the rich semantics of HINs (e.g., high-order structure), which prevents them from supporting complex queries. To fill this gap, we formally define the problem of individual fairest community search over HINs (denoted as IFCS), which aims to find a set of vertices from the HIN that share the same type, have close relationships, and differ little in activity level; we show that the problem is NP-hard. To solve it, we first develop an exploration-based filter that effectively reduces the search space of the community. Further, to avoid repeated computation and to prune unfair communities in advance, we propose a message-based scheme and a lower bound-based scheme. Finally, we conduct extensive experiments on four real-world datasets to demonstrate the effectiveness and efficiency of our proposed algorithms, which run at least 3 times faster than the baseline solution.

Paper Structure

This paper contains 18 sections, 1 theorem, 7 equations, 11 figures, 3 tables, 4 algorithms.

Key Result

Corollary 1

Given an HIN $G$, a query vertex $u$ of a motif $q$, the BFS order $\pi$ of $q$, and a vertex $v$ in $G$: if $v$ satisfies the exact star isomorphism constraint, then the following two conditions must hold: (1) $\forall u' \in N(u) \wedge idx_{u'}(\pi) < idx_u(\pi)$, $\exists v' \in N(v)$ such that $v'$ ...

Figures (11)

  • Figure 1: An example of HIN
  • Figure 2: An example of fairness community search
  • Figure 3: Example of exploration-based filtering search
  • Figure 4: An example of message-passing strategy
  • Figure 5: Time cost of Target-aware Community Search on Four Datasets under Default Parameter Setting
  • ...and 6 more figures

Theorems & Definitions (17)

  • Example 1
  • Definition 1: Motif (Milo et al., 2002)
  • Definition 2: Instance of Motif (Milo et al., 2002)
  • Definition 3: Target-aware Motif
  • Definition 4: Instances of Target Vertex
  • Definition 5: Instances of Motif around Instance of Target Vertex
  • Definition 6: Active Level of Target Vertex Instance
  • Definition 7: Target-aware Community
  • Definition 8: $M$-graph
  • Definition 9: Fairness Score of Target-Aware Community
  • ...and 7 more