DS-Span: Single-Phase Discriminative Subgraph Mining for Efficient Graph Embeddings

Yeamin Kaiser, Muhammed Tasnim Bin Anwar, Bholanath Das

TL;DR

This work tackles the inefficiencies of multi-phase discriminative subgraph mining in graph classification. It introduces DS-Span, a single-phase DFS-based framework that integrates pattern growth, pruning, and supervision-driven scoring via coverage-capped eligibility and information-gain-guided selection, yielding a compact, highly discriminative subgraph set. The mined features are used to construct normalized incidence vectors and a shallow CBOW-like embedding, achieving higher or competitive accuracy with substantially reduced mining time on standard benchmarks. Overall, DS-Span demonstrates that unified, coverage-aware discriminative mining can enhance scalability and interpretability in graph representation learning, with practical implications for domains like drug discovery.

Abstract

Graph representation learning seeks to transform complex, high-dimensional graph structures into compact vector spaces that preserve both topology and semantics. Among the various strategies, subgraph-based methods provide an interpretable bridge between symbolic pattern discovery and continuous embedding learning. Yet, existing frequent or discriminative subgraph mining approaches often suffer from redundant multi-phase pipelines, high computational cost, and weak coupling between mined structures and their discriminative relevance. We propose DS-Span, a single-phase discriminative subgraph mining framework that unifies pattern growth, pruning, and supervision-driven scoring within one traversal of the search space. DS-Span introduces a coverage-capped eligibility mechanism that dynamically limits exploration once a graph is sufficiently represented, and an information-gain-guided selection that promotes subgraphs with strong class-separating ability while minimizing redundancy. The resulting subgraph set serves as an efficient, interpretable basis for downstream graph embedding and classification. Extensive experiments across benchmarks demonstrate that DS-Span generates more compact and discriminative subgraph features than prior multi-stage methods, achieving higher or comparable accuracy with significantly reduced runtime. These results highlight the potential of unified, single-phase discriminative mining as a foundation for scalable and interpretable graph representation learning.
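The two mechanisms named in the abstract, information-gain scoring and a coverage cap, can be illustrated with a small sketch. This is not the authors' algorithm: DS-Span applies these ideas inside a single DFS traversal, whereas the sketch below scores an already-enumerated set of candidate patterns, which is a deliberate simplification. The function names, the `cap` parameter, and the representation of a pattern as the set of graph indices that contain it are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(labels, contains):
    """Entropy reduction from splitting graphs by pattern presence.

    contains[i] is True when graph i contains the candidate pattern.
    """
    n = len(labels)
    present = [y for y, c in zip(labels, contains) if c]
    absent = [y for y, c in zip(labels, contains) if not c]
    conditional = (len(present) / n) * entropy(present) \
        + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


def select_with_coverage_cap(labels, patterns, cap):
    """Greedy selection (hypothetical helper, not from the paper).

    patterns maps a pattern name to the set of graph indices containing it.
    A pattern is kept only if it occurs in at least one graph that is
    covered by fewer than `cap` already-selected patterns.
    """
    n = len(labels)
    ranked = sorted(
        patterns,
        key=lambda p: information_gain(labels, [i in patterns[p] for i in range(n)]),
        reverse=True,
    )
    coverage = [0] * n
    selected = []
    for name in ranked:
        if not any(coverage[i] < cap for i in patterns[name]):
            continue  # every supporting graph is already sufficiently covered
        selected.append(name)
        for i in patterns[name]:
            coverage[i] += 1
    return selected


labels = [0, 0, 1, 1]
patterns = {"a": {0, 1}, "b": {0, 1}, "c": {0, 2}}
selected = select_with_coverage_cap(labels, patterns, cap=1)
```

In this toy input, patterns `a` and `b` separate the classes equally well, but `b` only covers graphs that `a` has already covered, so the cap filters it out as redundant.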

Paper Structure

This paper contains 15 sections, 6 equations, 1 figure, 3 tables, and 2 algorithms.

Figures (1)

  • Figure 1: t-SNE visualisations for the D&D dataset comparing DS-Span embeddings, Gaussian noise, and the reproduced DisFPGC embeddings.