Generalized partitioned local depth

Kenneth S. Berenhaut, John D. Foley, Liangdongsheng Lyu

TL;DR

This work generalizes partitioned local depth (PaLD) by introducing probabilistic locality via $R_{x,y,z}$ and probabilistic support division via $Q_{x,y,z}$, enabling cohesion analysis under uncertainty and conflicting information. It formalizes generalized partitioned local depth $\ell_{S,\boldsymbol{R},\boldsymbol{Q}}(x)$ and cohesion $C_{x,w}$, proving core properties such as dissipation under separation and conservation of cohesion, and recovers the original PaLD model when $R$ and $Q$ are binary. The paper demonstrates practical applications, including combining multiple dissimilarity measures, event-based data, and data uncertainty, with examples from cultural distances, political analysis, and sports data (NBA). Overall, the framework enables robust inference of community structure in non-metric, uncertain data without requiring fixed neighborhoods or distributional assumptions, and opens avenues for uncertainty-aware network analysis and persistent-like studies.

Abstract

In this paper we provide a generalization of the concept of cohesion as introduced recently by Berenhaut, Moore and Melvin [Proceedings of the National Academy of Sciences, 119 (4) (2022)]. The formulation presented builds on the technique of partitioned local depth by distilling two key probabilistic concepts: local relevance and support division. Earlier results are extended within the new context, and examples of applications to revealing communities in data with uncertainty are included. The work sheds light on the foundations of partitioned local depth, and extends the original ideas to enable probabilistic consideration of uncertain, variable and potentially conflicting information.

Paper Structure (11 sections, 5 theorems, 30 equations, 6 figures, 1 algorithm)


Key Result

Theorem 1

(Dissipation of cohesion under separation) Suppose $\boldsymbol{R}$ and $\boldsymbol{Q}$ are fixed and $S$ is a disjoint union of $A$ and $B$. If $A$ and $B$ are sufficiently separated with respect to $\boldsymbol{R}$ and $\boldsymbol{Q}$, then the between-set cohesion values are zero, i.e., $C_{a,b} = C_{b,a} = 0$ for all $a \in A$ and $b \in B$.
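
The dissipation property can be checked numerically in the binary special case, where the framework reduces to the original PaLD model. The sketch below is illustrative only and is not the authors' implementation. It assumes that cohesion takes the form $C_{x,w} = \frac{1}{n-1}\sum_{y \neq x} R_{x,y,w}\,Q_{x,y,w} / \sum_{z} R_{x,y,z}$, which is one reading of the sampling scheme (draw $Y$ uniformly from $S \setminus \{x\}$, draw $Z$ in proportion to its relevance, then assign support to $x$ with probability $Q$). The function names `binary_relevance_and_support`, `cohesion`, and `local_depths` are hypothetical, and the one-dimensional data are made up for the demonstration.

```python
from itertools import product


def binary_relevance_and_support(D):
    """Build binary R and Q from a symmetric dissimilarity matrix D.

    R[x][y][z] = 1 if z is in the local focus of (x, y), i.e. z is at
    least as close to x or to y as x and y are to each other.
    Q[x][y][z] = 1 if z is closer to x than to y, 0.5 on ties, else 0.
    """
    n = len(D)
    R = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    Q = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for x, y, z in product(range(n), repeat=3):
        if x == y:
            continue
        if min(D[z][x], D[z][y]) <= D[x][y]:
            R[x][y][z] = 1.0
        if D[z][x] < D[z][y]:
            Q[x][y][z] = 1.0
        elif D[z][x] == D[z][y]:
            Q[x][y][z] = 0.5
    return R, Q


def cohesion(R, Q):
    """Cohesion matrix C under the assumed formula stated in the lead-in."""
    n = len(R)
    C = [[0.0] * n for _ in range(n)]
    for x in range(n):
        for y in range(n):
            if y == x:
                continue
            total = sum(R[x][y])  # total relevance weight for the pair (x, y)
            if total == 0:
                continue
            for w in range(n):
                C[x][w] += R[x][y][w] * Q[x][y][w] / (total * (n - 1))
    return C


def local_depths(C):
    """Local depth of each point: the row sums of the cohesion matrix."""
    return [sum(row) for row in C]


# Two well-separated groups on a line: A = indices 0..2, B = indices 3..5.
points = [0, 1, 2, 100, 101, 103]
D = [[abs(p - q) for q in points] for p in points]
R, Q = binary_relevance_and_support(D)
C = cohesion(R, Q)

between = [C[a][b] for a in range(3) for b in range(3, 6)]
print("max between-set cohesion:", max(between))
print("sum of local depths:", sum(local_depths(C)))
```

For this toy data the between-set entries are exactly zero, and the local depths sum to $n/2$ (up to floating-point rounding), in line with conservation of cohesion when $Q_{x,y,z} + Q_{y,x,z} = 1$ and $R$ is symmetric in its first two indices. Non-binary values of `R` and `Q` in $[0,1]$ can be passed to `cohesion` directly under the same assumed formula.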

Figures (6)

  • Figure 1: Cultural communities from survey data; adapted from Berenhaut, Moore and Melvin (2022) with permission. In A, we display the community structure obtained from the cultural fixation index values from [9] for regions within the United States, China, India, and the European Union. In B, we display the distribution of within-group cohesions and distances; colored bars for (mutual) cohesion indicate values above the threshold of 0.0217 (see (\ref{TSd})). Note that distances are brought to comparable levels of cohesion.
  • Figure 2: The local focus for a fixed point $x$ and a random point $Y$, in two-dimensional Euclidean space. The points in red are outside the focus. Those in green (and $Z$ in blue) are in the focus and closer to $x$, while those in grey are closer to $Y$.
  • Figure 3: Conceptual generative process for random triplet comparisons.
  • Figure 4: Cohesion networks based on $D^*$ as the relative weight on the Social dimension increases through the values $0, 0.5, 1.0, 2.0, 10, 100$. Ties above the threshold in (\ref{TSd}) are displayed. Each plot uses the layout based on the Social dimension in isolation.
  • Figure 5: Cohesion networks based on $R^*$ and $Q^*$ as the relative weight on the Social dimension increases through the values $0, 0.5, 1.0, 2.0, 10, 100$. Ties above the threshold in (\ref{thresholdbd}) are displayed. Each plot uses the layout based on the Social dimension in isolation.
  • ...and 1 more figure

Theorems & Definitions (5)

  • Theorem 1
  • Theorem 2
  • Theorem 3
  • Theorem 4
  • Theorem 5