Bayesian and Convolutional Networks for Hierarchical Morphological Classification of Galaxies
Jonathan Serrano-Pérez, Raquel Díaz Hernández, L. Enrique Sucar
TL;DR
The paper tackles the challenge of hierarchical morphological classification of galaxies by proposing BCNN, a two-module system that couples a pre-trained CNN with a Bayesian network to enforce the Hubble-sequence hierarchy during inference. The BN captures hierarchical dependencies through $P(y_i|pa(y_i))$ while $q_i$ nodes model the base CNN outputs, enabling probabilistic top-down path inference to produce consistent predictions. Empirical results on Hubble telescope images show BCNN significantly outperforms standalone CNNs, with EM improving from around 57% to 64% and the hierarchical F-measure from roughly 77% to 81%, and further gains achieved via image augmentation and full-network fine-tuning, culminating in EM around 67% and hF around 83% when compared against multiple CNN baselines. The work demonstrates that explicit hierarchical modeling plus targeted data augmentation and fine-tuning yields meaningful advances in automated galaxy morphology classification, with practical impact for scalable, consistent morphology catalogs and downstream astrophysical analyses.
Abstract
This work is focused on the morphological classification of galaxies following the Hubble sequence in which the different classes are arranged in a hierarchy. The proposed method, BCNN, is composed of two main modules. First, a convolutional neural network (CNN) is trained with images of the different classes of galaxies (image augmentation is carried out to balance some classes); the CNN outputs the probability for each class of the hierarchy, and its outputs/predictions feed the second module. The second module consists of a Bayesian network that represents the hierarchy and helps to improve the prediction accuracy by combining the predictions of the first phase while maintaining the hierarchical constraint (in a hierarchy, an instance associated with a node must be associated to all its ancestors), through probabilistic inference over the Bayesian network so that a consistent prediction is obtained. Different images from the Hubble telescope have been collected and labeled by experts, which are used to perform the experiments. The results show that BCNN performed better than several CNNs in multiple evaluation measures, reaching the next scores: 67% in exact match, 78% in accuracy, and 83% in hierarchical F-measure.
