BERT4FCA: A Method for Bipartite Link Prediction using Formal Concept Analysis and BERT

Siqi Peng, Hongyuan Yang, Akihiro Yamamoto

TL;DR

BERT4FCA addresses bipartite link prediction by marrying formal concept analysis with a pre-trained BERT framework to exploit rich information in concept lattices. The method converts bipartite networks into formal contexts, extracts maximal bi-cliques as formal concepts, and uses neighboring relations in the concept lattice through a dedicated pre-training regime with tasks for masked token prediction and concept-neighbor prediction. Fine-tuning then handles both O-O and O-A prediction tasks, leveraging two specialized models (object and attribute) and eliminating position embeddings to reflect order-free concept structures. Empirical results on three real-world datasets show that BERT4FCA consistently outperforms previous FCA-based methods and classic baselines, with ablations confirming the value of incorporating neighboring relations and concept-lattice information. Overall, BERT4FCA provides a general framework for applying BERT to FCA-derived representations, with potential applicability to broader FCA-related prediction and discovery tasks.
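The first two steps above (bipartite network to formal context, then formal concepts) can be illustrated with a minimal brute-force sketch. This is not the authors' implementation: the function name `formal_concepts` and the toy edge list are hypothetical, and the enumeration is exponential in the number of objects, so it only suits tiny examples. Each concept is a pair (extent, intent) closed under the two derivation operators; concepts in which both sets are non-empty correspond to maximal bi-cliques.

```python
from itertools import combinations


def formal_concepts(edges):
    """Enumerate all formal concepts of a bipartite network by brute force.

    `edges` is an iterable of (object, attribute) pairs, i.e. the incidence
    relation of the formal context. Hypothetical helper for illustration only.
    """
    incidence = set(edges)
    objects = sorted({u for u, _ in incidence})
    attributes = sorted({v for _, v in incidence})

    def common_attributes(objs):
        # Attributes shared by every object in `objs` (all attributes if empty).
        return frozenset(
            a for a in attributes if all((o, a) in incidence for o in objs)
        )

    def common_objects(attrs):
        # Objects that have every attribute in `attrs` (all objects if empty).
        return frozenset(
            o for o in objects if all((o, a) in incidence for a in attrs)
        )

    concepts = set()
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset)
            extent = common_objects(intent)  # closure of `subset`
            concepts.add((extent, intent))
    return concepts


# Toy network: u1 and u2 link to both v1 and v2; u3 links only to v2.
toy_edges = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2"), ("u3", "v2")]
toy_concepts = formal_concepts(toy_edges)
for extent, intent in sorted(toy_concepts, key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

On this toy network the sketch yields two concepts, ({u1, u2}, {v1, v2}) and ({u1, u2, u3}, {v2}), and both are maximal bi-cliques. Practical FCA tools use far more efficient enumeration algorithms than this exhaustive loop.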

Abstract

We propose BERT4FCA, a novel method for link prediction in bipartite networks, using formal concept analysis (FCA) and BERT. Link prediction in bipartite networks is an important task that can solve various practical problems, such as friend recommendation in social networks and co-authorship prediction in author-paper networks. Recent research has found that in bipartite networks, maximal bi-cliques provide important information for link prediction, and that they can be extracted by FCA. Some FCA-based bipartite link prediction methods have achieved good performance. However, we found that their performance could be further improved, because these methods do not fully capture the rich information in the extracted maximal bi-cliques. To address this limitation, we propose an approach using BERT, which can learn more information from the maximal bi-cliques extracted by FCA and use it to make link predictions. We conduct experiments on three real-world bipartite networks and demonstrate that our method outperforms previous FCA-based methods as well as some classic methods such as matrix factorization and node2vec.

Paper Structure (18 sections, 4 equations, 5 figures, 6 tables, 1 algorithm)

This paper contains 18 sections, 4 equations, 5 figures, 6 tables, 1 algorithm.

Figures (5)

  • Figure 1: An example of a bipartite network $(U,V,E)$ and two of its bi-cliques. The nodes in blue form the node set $U$, and the nodes in red form the node set $V$. The two sub-networks framed in green and purple are two maximal bi-cliques of the network.
  • Figure 2: Left: A sample formal context. Right: The concept lattice corresponding to the formal context in the left panel. The nodes in yellow are neighbors and the nodes in red are not neighbors.
  • Figure 3: A depiction of the equivalence between bipartite networks and formal contexts, as well as the equivalence between maximal bi-cliques and formal concepts. The bipartite network on the left can be represented as the formal context on the right. The sub-networks circled in purple and green are maximal bi-cliques of the bipartite network on the left; they correspond to the two formal concepts framed in the matching colors in the formal context on the right.
  • Figure 4: An overview of the workflow of our method.
  • Figure 5: The comparison of how much information from a concept lattice is learned and used when predicting an object by two methods, with object2vec shown on the left and BERT4FCA shown on the right. The target object to be predicted is circled in blue. The information used for predicting the object is shown in red.
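
The neighbor relation mentioned in the Figure 2 caption can be sketched as follows, assuming "neighbors" means the standard FCA cover relation: one concept is a direct sub-concept of another, with no third concept strictly between them. The function name `lattice_neighbors` and the four-concept toy lattice are hypothetical and not taken from the paper.

```python
def lattice_neighbors(concepts):
    """Return (lower, upper) pairs where `upper` directly covers `lower`.

    Each concept is an (extent, intent) pair of frozensets. Concepts are
    ordered by strict inclusion of their extents. Assumption: this cover
    relation is what "neighbors" refers to.
    """
    concepts = list(concepts)
    pairs = []
    for lower in concepts:
        for upper in concepts:
            if lower[0] < upper[0]:  # strict subset of extents
                # Neighbors only if no concept lies strictly in between.
                has_between = any(
                    lower[0] < mid[0] < upper[0] for mid in concepts
                )
                if not has_between:
                    pairs.append((lower, upper))
    return pairs


# Toy lattice for a context where o1 has attribute a and o2 has attribute b.
top = (frozenset({"o1", "o2"}), frozenset())
left = (frozenset({"o1"}), frozenset({"a"}))
right = (frozenset({"o2"}), frozenset({"b"}))
bottom = (frozenset(), frozenset({"a", "b"}))

neighbor_pairs = lattice_neighbors([top, left, right, bottom])
for lower, upper in neighbor_pairs:
    print(sorted(lower[0]), "is covered by", sorted(upper[0]))
```

In this toy lattice the top and bottom concepts are comparable but not neighbors, because `left` and `right` each lie strictly between them.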