Interpreting Contrastive Embeddings in Specific Domains with Fuzzy Rules

Javier Fumanal-Idocin, Mohammadreza Jamalifard, Javier Andreu-Perez

Abstract

Free-style text is still one of the most common ways in which data is recorded in real environments, such as legal procedures and medical records. Because of that, there have been significant efforts in natural language processing to convert these texts into a structured format that standard machine learning methods can then exploit. One of the most popular methods to embed text into a vector representation is the Contrastive Language-Image Pre-training model (CLIP), which was trained using both images and text. Although the representations computed by CLIP have been very successful in zero-shot and few-shot learning problems, they still struggle when applied to a particular domain. In this work, we use a fuzzy rule-based classification system along with some standard text processing techniques to map some of our features of interest to the space created by a CLIP model. Then, we discuss the rules and associations obtained and the importance of each feature considered. We apply this approach in two different data domains, clinical reports and film reviews, and compare the results obtained individually and when considering both. Finally, we discuss the limitations of this approach and how it could be further improved.

Paper Structure (14 sections, 6 equations, 6 figures, 1 table)

Figures (6)

  • Figure 1: Fuzzy partitions using interval-type 2 fuzzy sets for the considered variables in the clinical dataset.
  • Figure 2: Proposed methodology to exploit CLIP features and sentiment analysis using an FRBC. (1) The collection of texts to study. (2) We compute the CLIP embeddings for all texts. (3) We use the K-Means clustering algorithm to obtain the structures formed in this space. (4) For each text, we extract the sentiment analysis features. (5) The FRBC takes as input, for each text, the sentiment analysis features, using the fuzzy partitions shown in Figure 1, and uses the clusters detected in Step 3 as targets. In this way, the FRBC maps our features of interest to the spatial regions where they were projected in the embedding space.
  • Figure 3: Silhouette index for K-Means clustering applied to the CLIP features obtained from all the patients' reports.
  • Figure 4: Silhouette index for K-Means clustering applied to the CLIP features obtained for the Film dataset.
  • Figure 5: Results with rules obtained for patient reports that map sentiment metrics to clusters in CLIP space. $\rightarrow$ low, $\rightarrow$ medium, $\rightarrow$ high, $\rightarrow$ irrelevant. DS stands for dominance score, and Acc. for the accuracy obtained by each rule in the samples where it fired.
  • ...and 1 more figure
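
Step 3 of the pipeline in Figure 2, together with the silhouette-based choice of the number of clusters suggested by Figures 3 and 4, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: toy 2-D points stand in for CLIP embeddings, K-Means is a plain Lloyd's loop seeded with a deterministic farthest-point heuristic (swapped in here only to keep the example reproducible), and the names `farthest_point_init`, `kmeans`, `silhouette`, and `targets` are hypothetical.

```python
import math
import random


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start at the first
    point, then repeatedly add the point farthest from the chosen centroids."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; returns one cluster label per point."""
    centroids = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(xs) / len(members) for xs in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient; singleton clusters score 0 by convention."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy stand-in for text embeddings: three tight, well-separated 2-D blobs.
rng = random.Random(0)
centers = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
points = [
    (cx + rng.uniform(-0.5, 0.5), cy + rng.uniform(-0.5, 0.5))
    for cx, cy in centers
    for _ in range(20)
]

# Score each candidate number of clusters and keep the best one.
scores = {k: silhouette(points, kmeans(points, k)) for k in range(2, 7)}
best_k = max(scores, key=scores.get)

# Cluster ids that a downstream classifier could use as class targets.
targets = kmeans(points, best_k)
```

On this toy data the three-blob structure should give the highest silhouette at three clusters. In the setting of Figure 2, the resulting `targets` would play the role of the class labels that the FRBC learns to predict from the sentiment features.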