Chromosomal Structural Abnormality Diagnosis by Homologous Similarity
Juren Li, Fanzhe Fu, Ran Wei, Yifei Sun, Zeyu Lai, Ning Song, Xin Chen, Yang Yang
TL;DR
This work tackles the diagnosis of chromosomal structural abnormalities, which are rare, morphologically diverse, and differently distributed across hospitals. It introduces HomNet, a homologous-similarity-driven framework consisting of CMSBlock, HomBlock, and BagBlock, which is trained via self-supervised learning on artificial abnormalities and then fine-tuned for each hospital. Across four real-world hospital datasets and a public dataset with artificial abnormalities, HomNet consistently outperforms baselines, and ablation studies confirm the value of pairwise alignment, multi-pair aggregation, and regional feature modeling. The approach is integrated into AutoVision for millisecond-scale clinical inference, achieving high diagnostic accuracy and enabling scalable, multi-center chromosomal analysis in practice.
Abstract
Pathogenic chromosome abnormalities are very common among the general population. While numerical chromosome abnormalities can be detected quickly and precisely, structural chromosome abnormalities are far more complex and typically require considerable effort from human experts to identify. This paper investigates the modeling of chromosome features and the identification of chromosomes with structural abnormalities. Most existing data-driven methods consider each chromosome independently, overlooking the crucial information carried by its homologous chromosome. Homologous chromosomes normally share identical structures, except when one of them carries an abnormality. We therefore propose an adaptive method that aligns homologous chromosomes and diagnoses structural abnormalities through homologous similarity. Inspired by the diagnostic process of human experts, we incorporate information from multiple pairs of homologous chromosomes simultaneously, aiming to reduce the influence of noise and improve prediction performance. Extensive experiments on real-world datasets validate the effectiveness of our model compared to baselines.
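The idea of comparing homologs and pooling evidence over several pairs can be illustrated with a toy sketch. This is not HomNet and does not reproduce CMSBlock, HomBlock, or BagBlock. It assumes each chromosome is summarized as a one-dimensional banding-intensity profile, uses plain linear resampling as a stand-in for adaptive alignment, Pearson correlation as the similarity measure, and the median as the multi-pair aggregator. All function names and the synthetic profiles are hypothetical.

```python
import math
from statistics import median


def resample(profile, length=64):
    """Linearly resample a 1-D profile to `length` points (crude length alignment)."""
    n = len(profile)
    if n == 1:
        return [float(profile[0])] * length
    out = []
    for i in range(length):
        pos = i * (n - 1) / (length - 1)  # fractional index into the original
        lo = min(int(pos), n - 2)
        frac = pos - lo
        out.append(profile[lo] * (1.0 - frac) + profile[lo + 1] * frac)
    return out


def pair_dissimilarity(a, b, length=64):
    """1 - Pearson correlation between two resampled homolog profiles."""
    x, y = resample(a, length), resample(b, length)
    mx, my = sum(x) / length, sum(y) / length
    x = [v - mx for v in x]
    y = [v - my for v in y]
    denom = math.sqrt(sum(v * v for v in x) * sum(v * v for v in y))
    if denom == 0.0:
        return 0.0  # flat profiles carry no banding signal; treat as identical
    return 1.0 - sum(p * q for p, q in zip(x, y)) / denom


def bag_score(pairs, length=64):
    """Median dissimilarity over several homologous pairs (assumed: one pair per cell)."""
    return median(pair_dissimilarity(a, b, length) for a, b in pairs)


def toy_profile(n):
    """Synthetic banding profile of n samples (illustrative data only)."""
    return [math.sin(6 * math.pi * i / (n - 1)) + i / (n - 1) for i in range(n)]


def with_inversion(profile):
    """Simulate an artificial abnormality by reversing the middle half of a profile."""
    n = len(profile)
    start, end = n // 4, 3 * n // 4
    return profile[:start] + profile[start:end][::-1] + profile[end:]


if __name__ == "__main__":
    normal = [(toy_profile(80), toy_profile(70)), (toy_profile(90), toy_profile(75))]
    abnormal = [(toy_profile(80), with_inversion(toy_profile(70))),
                (toy_profile(90), with_inversion(toy_profile(75)))]
    print("normal bag score:  ", bag_score(normal))
    print("abnormal bag score:", bag_score(abnormal))
```

Taking the median over pairs means that a single noisy pair cannot dominate the score, which mirrors the motivation given above for using several pairs at once. A learned model would replace both the hand-written similarity and the aggregator.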
