Importance inversion transfer identifies shared principles for cross-domain learning

Daniele Caligiore

TL;DR

This work introduces Explainable Cross-Domain Transfer Learning (X-CDTL), a framework that identifies domain-invariant topological anchors via Importance Inversion Transfer (IIT) to enable robust cross-domain learning across highly heterogeneous networks. By combining network science with explainable AI, the approach isolates stable structural invariants and uses a two-tier alignment (Global IIT_score,G and pairwise IIT_score) followed by PCA-SVD synchronization to transfer knowledge without relying on opaque latent embeddings. Empirical results across social, molecular, protein, and linguistic networks demonstrate a compact set of eight anchors that support cross-domain anomaly detection, with up to a 56% improvement in decision stability under extreme noise and data scarcity, and a positive correlation between IIT_anchor strength and transfer gains. The framework reveals a transfer paradox where maximal generalization arises at intermediate structural similarity and shows a diversity-driven rescue effect in highly corrupted regimes, underscoring the value of interpretable topological laws for principled generalization and scientific discovery across disciplines.

Abstract

The capacity to transfer knowledge across scientific domains relies on shared organizational principles. However, existing transfer-learning methodologies often fail to bridge radically heterogeneous systems, particularly under severe data scarcity or stochastic noise. This study formalizes Explainable Cross-Domain Transfer Learning (X-CDTL), a framework unifying network science and explainable artificial intelligence to identify structural invariants that generalize across biological, linguistic, molecular, and social networks. By introducing the Importance Inversion Transfer (IIT) mechanism, the framework prioritizes domain-invariant structural anchors over idiosyncratic, highly discriminative features. In anomaly detection tasks, models guided by these principles achieve significant performance gains over traditional baselines, including a 56% relative improvement in decision stability under extreme noise. These results provide evidence for a shared organizational signature across heterogeneous domains, establishing a principled paradigm for cross-disciplinary knowledge propagation. By shifting from opaque latent representations to explicit structural laws, this work advances machine learning as a robust engine for scientific discovery.

Paper Structure (44 sections, 4 equations, 8 figures, 9 tables)

This paper contains 44 sections, 4 equations, 8 figures, 9 tables.

Figures (8)

  • Figure 1: Structural landscapes and topological diversity across domains. Representative graph samples illustrate distinct density, modularity, and branching patterns. Connectivity definitions vary across scientific scales: in social networks, nodes represent users connected by friendships; in molecular graphs, nodes denote atoms joined by chemical bonds; in protein networks, nodes represent amino acids linked by physical interactions; and in linguistic networks, nodes denote words connected by contextual co-occurrence. These architectures underpin the X-CDTL framework by providing a heterogeneous set of structural priors for the manifold alignment pipeline.
  • Figure 2: Global ranking of structural anchors. Hierarchy of the topological feature space based on the Global Consensus $\text{IIT}_{\text{score, G}}$. The green bars identify the eight structural anchors. The hierarchy reveals a significant reordering of features compared to raw discriminative importance and identifies a distinct performance gap after the eighth descriptor, justifying the parsimonious 12/8 configuration. The red dashed line designates the cut-off boundary that isolates domain-invariant anchors from idiosyncratic, noise-prone markers.
  • Figure 3: Predictive validation of the IIT score. Linear regression between the aggregate pairwise IIT Score ($\overline{\text{IIT}}_{\text{score}}$) and the realized Transfer Gain Index (TGI). The substantial positive correlation ($r = 0.503$) identifies the $\text{IIT}_{\text{score}}$ as a robust predictor of transfer effectiveness. High-affinity pairs, notably Proteins $\to$ Linguistic (green star), occupy the upper-right region, whereas outliers above the regression line identify a diversity-driven rescue effect where rigid structural skeletons regularize noisy target manifolds.
  • Figure 4: Workflow of the IIT strategy within the X-CDTL framework. The pipeline is organized into three sequential modules. Stage I: Network Characterization. Relational data from social, molecular, protein, and linguistic domains are transformed into a unified graph-based representation to extract multi-scale topological descriptors spanning local, mesoscopic, and global scales. Stage II: Importance Inversion Transfer (IIT). A consensus-based Borda Inverse Ranking strategy is employed to identify structural anchors by prioritizing features with the lowest discriminative utility for domain classification and the highest cross-domain stability. Stage III: Aligned Knowledge Transfer. Target and source manifolds undergo z-score standardization and PCA-SVD alignment. This manifold synchronization enables robust cross-domain anomaly detection under severe data scarcity and noise, effectively yielding a common structural fingerprint (structural anchors) that captures shared organizational principles across disparate scientific fields.
  • Figure 5: Multi-scale structural distributions across network domains. Comparative boxplots for the four graph domains demonstrate the high separability of the structural feature space. The wide range of outliers underscores the inherent complexity of real-world relational data, while the robust non-overlapping distributions for key metrics like Density and Spectral Radius validate the selection of these domains as a demanding benchmark for cross-domain alignment. The pooled dataset of 20,000 graphs ($N=5{,}000$ per domain) underlies all distributional plots.
  • ...and 3 more figures
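
The Figure 4 caption describes Stage II as a consensus-based Borda inverse ranking and Stage III as z-score standardization followed by PCA-SVD alignment. The sketch below is one possible reading of those two steps, not the paper's implementation. It assumes each "voter" (for example, one domain classifier) supplies a vector of feature importances, that the least discriminative feature earns the most Borda points, and that alignment means projecting both standardized domains onto the principal axes of the source. It leaves out the cross-domain stability criterion the caption also mentions. The function names, feature names, and toy numbers are all hypothetical.

```python
import numpy as np

def inverse_borda_scores(importances):
    """Sum inverse Borda points over voters.

    importances: (n_voters, n_features); higher = more discriminative.
    Within each voter, the most discriminative feature gets 0 points
    and the least discriminative gets n_features - 1 (an assumption).
    """
    imp = np.asarray(importances, dtype=float)
    n_voters, n_features = imp.shape
    order = np.argsort(-imp, axis=1, kind="stable")  # most discriminative first
    ranks = np.empty_like(order)
    rows = np.arange(n_voters)[:, None]
    ranks[rows, order] = np.arange(n_features)       # position in that ordering
    return ranks.sum(axis=0)

def select_anchors(names, importances, k):
    """Return the k feature names with the highest inverse Borda score."""
    scores = inverse_borda_scores(importances)
    top = np.argsort(-scores, kind="stable")[:k]
    return [names[i] for i in top]

def zscore(X):
    """Standardize each column to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0                              # leave constant columns at zero
    return (X - X.mean(axis=0)) / std

def pca_svd_align(source, target, n_components):
    """Project both standardized domains onto the source principal axes."""
    Zs, Zt = zscore(source), zscore(target)
    _, _, Vt = np.linalg.svd(Zs, full_matrices=False)  # rows of Vt = PCA axes
    W = Vt[:n_components].T
    return Zs @ W, Zt @ W

# Toy data: two voters, four hypothetical descriptors.
names = ["feat_a", "feat_b", "feat_c", "feat_d"]
importances = [[0.9, 0.1, 0.5, 0.3],
               [0.8, 0.2, 0.6, 0.1]]
scores = inverse_borda_scores(importances)
anchors = select_anchors(names, importances, k=2)

rng = np.random.default_rng(0)
source = rng.normal(size=(50, 4))                    # 50 source graphs, 4 features
target = rng.normal(size=(20, 4))                    # 20 target graphs, same features
Ps, Pt = pca_svd_align(source, target, n_components=2)
```

In this toy case the two features that both voters rank as weakly discriminative are kept as anchors, while the feature that separates domains best is dropped. This mirrors the inversion the caption describes. Standardizing each domain separately before projecting is a design choice made here for illustration. The source does not say whether the paper fits the standardization per domain or on pooled data.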