DANI: Fast Diffusion Aware Network Inference with Preserving Topological Structure Property

Maryam Ramezani, Aryan Ahadinia, Erfan Farhadi, Hamid R. Rabiee

TL;DR

DANI addresses the challenge of inferring latent network topology from diffusion cascades under limited topology access while preserving key structural properties. It combines diffusion-driven Markov transitions with a structural similarity term to compute edge weights, yielding a weighted graph where $w_{uv} \propto \frac{\theta_{uv}}{\lambda_{uv}}$. The method has linear time complexity $O(M N \, \overline{n_c}^2)$ and includes a scalable MapReduce pipeline, with experiments on synthetic LFR benchmarks and real networks showing improved accuracy and faster runtimes compared to baselines while maintaining modular structure, degree distribution, density, and clustering. This diffusion-aware, topology-preserving approach enables more faithful network reconstruction and supports downstream analyses like community detection and diffusion modeling in large-scale settings.

Abstract

The fast growth of social networks and their data access limitations in recent years have made it increasingly difficult to obtain the complete topology of these networks. However, diffusion information over these networks is available, and many algorithms have been proposed to infer the underlying networks using this information. The previously proposed algorithms focus only on inferring more links and ignore preserving the critical topological characteristics of the underlying social networks. In this paper, we propose a novel method called DANI to infer the underlying network while preserving its structural properties. It is based on the Markov transition matrix derived from time series cascades, as well as the node-node similarity that can be observed in the cascade behavior from a structural point of view. In addition, the presented method has linear time complexity (increasing linearly with the number of nodes, the number of cascades, and the square of the average length of cascades), and its distributed version in the MapReduce framework is also scalable. We applied the proposed approach to both real and synthetic networks. The experimental results showed that DANI has higher accuracy and lower run time than well-known network inference methods while maintaining structural properties, including modular structure, degree distribution, connected components, density, and clustering coefficients.
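
The two ingredients named in the abstract, a per-cascade Markov transition matrix and a node-node similarity observed in cascade behavior, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `infer_edge_scores`, the rank-gap weighting of transitions, and the use of an inverse Jaccard overlap of cascade memberships as the similarity term are all assumptions made for illustration, since the exact definitions are not given in this section.

```python
from collections import defaultdict


def infer_edge_scores(cascades):
    """Score directed node pairs from cascades (hypothetical sketch).

    cascades: list of dicts mapping node -> infection time.
    Returns a dict mapping (u, v) -> score, higher meaning a more likely edge.
    """
    theta = defaultdict(float)      # accumulated transition mass per (u, v)
    membership = defaultdict(set)   # node -> indices of cascades containing it

    for idx, cascade in enumerate(cascades):
        # Order nodes by infection time (earliest first).
        ordered = sorted(cascade, key=cascade.get)
        for node in ordered:
            membership[node].add(idx)

        # ASSUMPTION: from u, transition mass goes only to later-infected
        # nodes, decaying with the rank gap, and each row sums to 1.
        for i, u in enumerate(ordered):
            later = ordered[i + 1:]
            if not later:
                continue
            raw = {v: 1.0 / (gap + 1) for gap, v in enumerate(later)}
            total = sum(raw.values())
            for v, r in raw.items():
                theta[(u, v)] += r / total

    scores = {}
    for (u, v), t in theta.items():
        shared = len(membership[u] & membership[v])
        union = len(membership[u] | membership[v])
        # ASSUMPTION: the similarity term is the inverse Jaccard overlap
        # of the cascade sets, so dividing by it rewards frequent co-occurrence.
        lam = union / shared
        scores[(u, v)] = t / lam
    return scores


if __name__ == "__main__":
    toy = [{"a": 0, "b": 1, "c": 2}, {"a": 0, "b": 1}]
    for pair, score in sorted(infer_edge_scores(toy).items()):
        print(pair, round(score, 3))
```

In this toy input, the pair that co-occurs in both cascades with adjacent infection ranks receives the largest score. Each cascade of length $n_c$ contributes on the order of $n_c^2$ ordered pairs, which matches the quadratic dependence on cascade length stated in the abstract.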

Paper Structure (20 sections, 18 equations, 12 figures, 7 tables, 1 algorithm)

Figures (12)

  • Figure 1: A toy example of one diffusion process over structural properties of the network (links and communities). Colored shadows separate the three communities. Strong ties (navy blue solid lines) form a community, and weak ties (green dashed lines) act as bridges between different communities. Core nodes ($\{A,C,D,F,G,H,L\}$) and boundary nodes ($\{B,E,I,K,M,N,O\}$) are shown. The directed path (red directed lines and nodes) shows a typical cascade over the network.
  • Figure 2: Problem definition. A set of cascades is the input. The proposed method should infer links between nodes. We evaluate the inferred network through topological similarities against the ground truth.
  • Figure 3: (Top) A network topology and two cascades over the network. The infection of each user with a contagion is represented by color, together with the infection time. The cascade vector calculated by the proposed method is also shown. (Bottom) Markov chain and matrix for each cascade: probable directed graph with the DANI algorithm. Edge probabilities are marked with thick and thin lines. The Markov transition matrix per cascade is obtained from these graphs, as shown here.
  • Figure 4: DANI algorithm in the MapReduce framework
  • Figure 5: The heat map indicates the score assigned to each edge. Fig. \ref{fig:CommunitySimilarityGT} is for the ground truth network, in which all edges are deterministically labeled. Cells between $i$ and $j$ are hot if there is an edge between $i$ and $j$ and cool if there is no edge. The nodes are rearranged so that each community forms a block. Each cell in Fig. \ref{fig:CommunitySimilarityDANI} shows the value of Eq. \ref{eq:edgeweight} for each pair of nodes $(i, j)$. Darker cells indicate higher values, which correspond to higher diffusion-based similarity between two nodes. This leads to the conclusion that communities can be specified using node-node diffusion-based similarity.
  • ...and 7 more figures