FI-CBL: A Probabilistic Method for Concept-Based Learning with Expert Rules
Lev V. Utkin, Andrei V. Konstantinov, Stanislav R. Kirpichenko
TL;DR
FI-CBL tackles concept-based learning by combining patch-level embeddings with a transparent frequentist-Bayesian inference framework and by enabling explicit incorporation of expert rules. Patches are clustered to form a distribution over concepts via empirical counts, and posteriors for each concept value are obtained through Bayes’ rule, with a multinomial model supporting inference for new images. A key contribution is the principled, rule-driven update of priors and conditionals, allowing expert knowledge to steer probabilistic reasoning, which yields strong performance in small-data regimes and provides interpretable, auditable decision logic. The work demonstrates practical advantages in medical-imaging-like tasks and underscores FI-CBL’s potential for robust, explainable AI under data-scarce conditions.
Abstract
A method for solving the concept-based learning (CBL) problem is proposed. The main idea is to divide each concept-annotated image into patches, transform the patches into embeddings using an autoencoder, and cluster the embeddings, assuming that each cluster will mainly contain embeddings of patches with certain concepts. To find the concepts of a new image, the method performs frequentist inference by computing prior and posterior probabilities of concepts based on the rates of patches from images with certain values of the concepts. The proposed method is therefore called Frequentist Inference CBL (FI-CBL). FI-CBL allows expert rules in the form of logic functions to be incorporated into the inference procedure. The idea behind the incorporation is to update the prior and conditional probabilities of concepts so that they satisfy the rules. The method is transparent because it has an explicit sequence of probabilistic calculations and a clear frequency interpretation. Numerical experiments show that FI-CBL outperforms the concept bottleneck model when the amount of training data is small. The code of the proposed algorithms is publicly available.
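
The counting-based inference described above can be illustrated with a minimal sketch. It is not the authors' released code. It assumes that patches have already been embedded and clustered, so each patch is represented only by a cluster index, and it treats a single concept with discrete values. The function names (fit_counts, posterior, apply_rule), the additive smoothing constant alpha, and the mask-and-renormalize treatment of an expert rule are illustrative assumptions rather than the procedure defined in the paper.

```python
import numpy as np

def fit_counts(cluster_ids, patch_concept, n_clusters, n_values=2, alpha=1.0):
    """Estimate P(concept = v) and P(cluster = k | concept = v) from patch counts.

    cluster_ids[i]   : cluster index of training patch i
    patch_concept[i] : concept value of the image that patch i was cut from
    alpha            : additive smoothing (an assumption, avoids zero rates)
    """
    cluster_ids = np.asarray(cluster_ids)
    patch_concept = np.asarray(patch_concept)
    counts = np.zeros((n_values, n_clusters))
    # counts[v, k] = number of patches in cluster k from images with value v
    np.add.at(counts, (patch_concept, cluster_ids), 1.0)
    prior = counts.sum(axis=1) / counts.sum()
    smoothed = counts + alpha
    cond = smoothed / smoothed.sum(axis=1, keepdims=True)
    return prior, cond

def posterior(new_cluster_ids, prior, cond):
    """Posterior over concept values for a new image via Bayes' rule,
    with a multinomial likelihood over the image's patch-cluster counts."""
    n_k = np.bincount(np.asarray(new_cluster_ids), minlength=cond.shape[1])
    log_post = np.log(prior) + n_k @ np.log(cond).T
    log_post -= log_post.max()  # numerical stability before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

def apply_rule(joint_prior, rule):
    """Hypothetical rule update: zero out concept-value pairs that violate
    a logic rule and renormalize the joint prior of two concepts."""
    n_a, n_b = joint_prior.shape
    mask = np.array([[float(rule(a, b)) for b in range(n_b)] for a in range(n_a)])
    constrained = joint_prior * mask
    return constrained / constrained.sum()

# Toy data: 6 training patches, 3 clusters, one binary concept.
prior, cond = fit_counts([0, 0, 1, 2, 2, 2], [0, 0, 0, 1, 1, 1], n_clusters=3)
post = posterior([2, 2, 0], prior, cond)

# Rule "concept A implies concept B" applied to a uniform joint prior.
joint = np.full((2, 2), 0.25)
constrained = apply_rule(joint, lambda a, b: a == 0 or b == 1)
```

In this toy setting, a new image whose patches fall mostly into the cluster dominated by concept value 1 receives a higher posterior for that value, and the rule removes all prior mass from the combination (A = 1, B = 0). Every intermediate quantity is a count or a ratio of counts, which is the transparency property the abstract refers to.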
