Dr. Tongue: Sign-Oriented Multi-label Detection for Remote Tongue Diagnosis
Yiliang Chen, Steven SC Ho, Cheng Xu, Yao Jie Xie, Wing-Fai Yeung, Shengfeng He, Jing Qin
TL;DR
This work addresses remote tongue diagnosis in telemedicine with two contributions: the TongueDx dataset and a Sign-Oriented multi-label framework (SignNet) that fuses whole-tongue, body, and edge information through region-aware attention. The pipeline first applies Adaptive Tongue Feature Extraction (ATFE) to detect, segment, and upright-align the tongue in each image, then uses SignNet to predict eight tongue surface attributes, with inter-sign relationships encoded in a graph. In the reported experiments, ResNet50 with ATFE and SignNet both outperform baseline models, and their F1-scores approach practitioner-level performance, although color- and lighting-sensitive attributes remain difficult because of data imbalance. The TongueDx dataset is publicly released, and the authors position the dataset and framework as a basis for robust, scalable remote tongue diagnosis. They outline three concrete directions for improving reliability and clinical utility: bounding boxes, video data, and color correction.
Abstract
Tongue diagnosis is a vital tool in Western and Traditional Chinese Medicine, providing key insights into a patient's health by analyzing tongue attributes. The COVID-19 pandemic has heightened the need for accurate remote medical assessments, emphasizing the importance of precise tongue attribute recognition via telehealth. To address this, we propose a Sign-Oriented multi-label Attributes Detection framework. Our approach begins with an adaptive tongue feature extraction module that standardizes tongue images and mitigates environmental factors. This is followed by a Sign-oriented Network (SignNet) that identifies specific tongue attributes, emulating the diagnostic process of experienced practitioners and enabling comprehensive health evaluations. To validate our methodology, we developed an extensive tongue image dataset specifically designed for telemedicine. Unlike existing datasets, ours is tailored for remote diagnosis, with a comprehensive set of attribute labels. This dataset will be openly available, providing a valuable resource for research. Initial tests have shown improved accuracy in detecting various tongue attributes, highlighting our framework's potential as an essential tool for remote medical assessments.
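To make the multi-label formulation concrete, the following is a minimal sketch of how per-attribute decisions and per-attribute F1-scores could be computed. It is not the authors' implementation: the function names, the independent sigmoid per attribute, and the 0.5 decision threshold are all assumptions chosen for illustration, and the toy inputs are not drawn from the dataset.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw score (logit) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_labels(logits, threshold=0.5):
    """Turn one image's per-attribute logits into 0/1 decisions.

    Assumption: each attribute is decided independently, so several
    attributes can be present in the same image (multi-label).
    """
    return [1 if sigmoid(z) >= threshold else 0 for z in logits]


def f1_per_label(y_true, y_pred):
    """Compute one F1-score per attribute over a set of images.

    y_true and y_pred are lists of equal-length 0/1 lists,
    one inner list per image and one entry per attribute.
    """
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Convention assumed here: F1 is 0.0 when the attribute never
        # appears in either the ground truth or the predictions.
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-attribute F1-scores."""
    scores = f1_per_label(y_true, y_pred)
    return sum(scores) / len(scores)
```

Reporting F1 per attribute, rather than a single accuracy figure, keeps rare attributes visible: an attribute that is seldom present can score poorly even when overall accuracy looks high.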
