T-Hop: A framework for studying the importance of path information in molecular graphs for chemical property prediction

Abdulrahman Ibraheem; Narsis Kiani; Jesper Tegner

TL;DR

This work investigates whether incorporating path information in molecular graphs improves QSAR predictions. The authors introduce T-Hop, a GNN-like framework with two modes: a non-degenerate mode that leverages path information via a tensor-based path representation $T^L$ and a learnable matrix $M$, and a degenerate mode that relies only on the adjacency $A$. Empirical results on six MoleculeNet datasets reveal that the benefit of path information is dataset-dependent, with the degenerate variant sometimes outperforming state-of-the-art methods despite using only 2-D information. They also demonstrate a first-step approach to predict upfront whether path information will help, using 36 graph-derived features per dataset, achieving 66.7% accuracy on a small test set. Overall, the paper highlights both the value and the cost of path information, offering a practical path to selectively apply it and showing that simple models can rival more complex ones in 2-D settings.

Abstract

This paper studies the usefulness of incorporating path information when predicting chemical properties from molecular graphs, in the domain of QSAR (Quantitative Structure-Activity Relationship). To this end, we developed a GNN-style model that can be toggled between two modes: a non-degenerate mode, which incorporates path information, and a degenerate mode, which leaves it out. By comparing the performance of the two modes on relevant QSAR datasets, we were able to directly assess the significance of path information on those datasets. Our results corroborate previous works by suggesting that the usefulness of path information is dataset-dependent. Unlike previous studies, however, we took the very first steps towards building a model that could predict upfront whether path information would be useful for a given dataset. Moreover, we found that, despite its simplicity, the degenerate mode of our model yielded rather surprising results, outperforming more sophisticated SOTA models in certain cases.
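As a rough illustration of the kind of information that powers of the adjacency matrix $A$ carry, the sketch below counts walks of a given length between nodes of a small graph, using the standard fact that entry $(i, j)$ of $A^L$ equals the number of length-$L$ walks from $i$ to $j$. This is not the paper's definition of $T^L$, which is not reproduced here, and walks (which may revisit nodes) are not the same as paths. The five-node chain and the helper names `mat_mul` and `adjacency_power` are assumptions made for this example only.

```python
# Minimal sketch: walk counts from powers of an adjacency matrix.
# Assumption: a hypothetical five-node chain graph 0-1-2-3-4,
# not the graph shown in the paper's Figure 1.

def mat_mul(x, y):
    """Multiply two matrices given as lists of lists."""
    inner, cols = len(y), len(y[0])
    return [
        [sum(row[k] * y[k][j] for k in range(inner)) for j in range(cols)]
        for row in x
    ]

def adjacency_power(adj, length):
    """Return adj raised to the given non-negative integer power."""
    n = len(adj)
    # Start from the identity matrix (power 0).
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(length):
        result = mat_mul(result, adj)
    return result

N = 5
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]

# Build a symmetric 0/1 adjacency matrix for the undirected graph.
A = [[0] * N for _ in range(N)]
for u, v in EDGES:
    A[u][v] = 1
    A[v][u] = 1

A2 = adjacency_power(A, 2)
A3 = adjacency_power(A, 3)

# A2[i][i] is the degree of node i; A2[0][2] counts the walk 0-1-2.
print("walks of length 2 from 0 to 2:", A2[0][2])
# A3[0][1] counts 0-1-0-1 and 0-1-2-1, which revisit nodes.
print("walks of length 3 from 0 to 1:", A3[0][1])
```

The second count shows why walk counts alone can overstate path information: both length-3 walks from node 0 to node 1 revisit a node, so neither is a simple path.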

Paper Structure (9 sections, 3 equations, 1 figure, 7 tables)

Introduction
Related work
Introducing the T-Hop Framework
Connection between $\mathcal{T}^{L}$ and the powered adjacency matrix
Experiments
Juxtaposition of the degenerate case against the non-degenerate case, and the relationship between accuracy and maximum path length
Towards upfront prediction of when path information helps
Comparing T-Hop with the SOTA
Conclusion

Figures (1)

Figure 1: An illustrative graph of five nodes

Theorems & Definitions (3)

Definition 4.1: Cardinality of multiset $\mathcal{P}^L$
proof
proof
