Inductive Meta-path Learning for Schema-complex Heterogeneous Information Networks
Shixuan Liu, Changjun Fan, Kewei Cheng, Yunfei Wang, Peng Cui, Yizhou Sun, Zhong Liu
TL;DR
SchemaWalk reframes meta-path learning for schema-complex heterogeneous information networks as an inductive, schema-level problem, using a reinforcement-learning path-finding agent on the schema graph and learning schema-level representations to avoid enumeration of path instances. By combining an encoder-decoder policy network with a reward that reflects meta-path coverage and confidence, SchemaWalk discovers high-quality meta-paths for multiple relations, including unseen ones, and demonstrates strong performance in multi-relational inductive and transductive KB reasoning as well as per-relation experiments on KBs and a schema-simple HIN. The approach provides improved explainability and efficiency over instance-based or embedding-centric methods and scales to large knowledge bases, with robust performance under partial evidence and meaningful meta-path extraction. The work suggests practical impact for KB reasoning, link prediction, and downstream tasks, and points to future extensions such as task-specific reward design, faster evaluation of meta-paths, and temporal or dynamic HINs.
Abstract
Heterogeneous Information Networks (HINs) are information networks with multiple types of nodes and edges. The concept of a meta-path, i.e., a sequence of entity types and relation types connecting two entities, was proposed to provide meta-level explainable semantics for various HIN tasks. Traditionally, meta-paths are primarily used for schema-simple HINs, e.g., bibliographic networks with only a few entity types, where meta-paths are often enumerated with domain knowledge. However, the adoption of meta-paths for schema-complex HINs, such as knowledge bases (KBs) with hundreds of entity and relation types, has been limited by the computational complexity of meta-path enumeration. Additionally, effectively assessing meta-paths requires enumerating relevant path instances, which adds further complexity to the meta-path learning process. To address these challenges, we propose SchemaWalk, an inductive meta-path learning framework for schema-complex HINs. We represent meta-paths with schema-level representations to support learning the scores of meta-paths for varying relations, mitigating the need for exhaustive path instance enumeration for each relation. Further, we design a reinforcement-learning-based path-finding agent, which directly navigates the network schema (i.e., schema graph) to learn policies for establishing meta-paths with high coverage and confidence for multiple relations. Extensive experiments on real data sets demonstrate the effectiveness of our proposed paradigm.
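To make the terms above concrete, the following is a minimal illustrative sketch, not the SchemaWalk implementation. It builds a toy HIN, derives its schema graph, checks that a meta-path is walkable on that schema, and scores the meta-path against a target relation. All entity, type, relation, and function names are hypothetical. The coverage and confidence formulas are assumed, rule-mining-style definitions (the abstract does not define them), and the scoring step deliberately enumerates path instances, which is the cost SchemaWalk aims to avoid.

```python
from collections import defaultdict

# Toy HIN; every name here is hypothetical and chosen only for illustration.
entity_type = {
    "alice": "Person", "bob": "Person", "carol": "Person",
    "acme": "Company", "globex": "Company",
    "paris": "City", "rome": "City",
}
triples = [
    ("alice", "works_for", "acme"),
    ("bob", "works_for", "acme"),
    ("carol", "works_for", "globex"),
    ("acme", "based_in", "paris"),
    ("globex", "based_in", "rome"),
    ("alice", "lives_in", "paris"),
    ("carol", "lives_in", "paris"),
]


def schema_graph(triples, entity_type):
    """Collapse instance triples into type-level edges (head type, relation, tail type)."""
    return {(entity_type[h], r, entity_type[t]) for h, r, t in triples}


def is_walkable(meta_path, schema):
    """A meta-path is walkable if every step is a schema edge and consecutive steps chain."""
    steps_exist = all(step in schema for step in meta_path)
    steps_chain = all(a[2] == b[0] for a, b in zip(meta_path, meta_path[1:]))
    return steps_exist and steps_chain


def pairs_connected_by(meta_path, triples):
    """Enumerate (start, end) entity pairs linked by at least one instance of the meta-path."""
    out = defaultdict(lambda: defaultdict(set))
    for h, r, t in triples:
        out[r][h].add(t)
    first_relation = meta_path[0][1]
    pairs = {(s, s) for s in out[first_relation]}
    for _, relation, _ in meta_path:
        pairs = {(s, t) for s, e in pairs for t in out[relation].get(e, ())}
    return pairs


def coverage_and_confidence(meta_path, target_relation, triples):
    """Assumed definitions: coverage = hits / target pairs, confidence = hits / path pairs."""
    target = {(h, t) for h, r, t in triples if r == target_relation}
    reached = pairs_connected_by(meta_path, triples)
    hits = len(target & reached)
    coverage = hits / len(target) if target else 0.0
    confidence = hits / len(reached) if reached else 0.0
    return coverage, confidence


schema = schema_graph(triples, entity_type)
meta_path = [("Person", "works_for", "Company"), ("Company", "based_in", "City")]
walkable = is_walkable(meta_path, schema)
coverage, confidence = coverage_and_confidence(meta_path, "lives_in", triples)
print(walkable, coverage, round(confidence, 3))
```

In this toy graph the meta-path Person, works_for, Company, based_in, City reaches three entity pairs, one of which is also a lives_in pair, while lives_in has two pairs in total. Even here, scoring required following every path instance; on a KB with hundreds of relation types, that enumeration is the bottleneck that motivates learning scores from schema-level representations instead.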
