ShennongAlpha: an AI-driven sharing and collaboration platform for intelligent curation, acquisition, and translation of natural medicinal material knowledge
Zijie Yang, Yongjing Yin, Chaojun Kong, Tiange Chi, Wufan Tao, Yue Zhang, Tian Xu
TL;DR
ShennongAlpha is an AI-driven platform that tackles the lack of standardized nomenclature, curation, and translation for Natural Medicinal Materials (NMMs) by introducing a Systematic Nomenclature (NMMSN) and a bilingual, collaborative knowledge base. The system integrates an open naming tool (ShennongName), multilingual knowledge management (MLMD), a retrieval-augmented chat interface (ShennongChat), and a standardized translation pipeline (NMT-CPT) to enable accurate, interpretable cross-language access to more than 14,000 NMM entries. Key innovations include the NMMSN encoding with a 4-digit base-36 ID (largest value $36^4 - 1 = 1{,}679{,}615$), a five-layer ShennongAlpha architecture, and a CGS-driven coreference graph search for consistent term mapping. The approach demonstrates how standardized nomenclature, robust search, and retrieval-augmented generation can transform domain-specific knowledge sharing, reduce mistranslations, and broaden global access to NMM knowledge for researchers, clinicians, and patients. This work offers a scalable model for AI-assisted knowledge sharing in specialized biomedical domains and provides tools and data to support future LLM training and cross-cultural dissemination of medicinal knowledge.
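The capacity figure for the 4-digit base-36 ID can be checked with a short sketch. The code below is illustrative only and is not the platform's implementation: the digit alphabet (0-9 followed by A-Z), the zero-padding, and the function names `encode_id` and `decode_id` are assumptions, and the summary does not state whether the all-zero code is reserved.

```python
import string

# Assumed digit alphabet: 0-9 then A-Z (the source only says "base-36").
ALPHABET = string.digits + string.ascii_uppercase
BASE = len(ALPHABET)          # 36
WIDTH = 4                     # four digits, as stated in the summary
MAX_VALUE = BASE ** WIDTH - 1 # 36^4 - 1 = 1,679,615


def encode_id(n: int) -> str:
    """Encode an integer as a zero-padded 4-character base-36 string (hypothetical helper)."""
    if not 0 <= n <= MAX_VALUE:
        raise ValueError(f"value must be in [0, {MAX_VALUE}]")
    chars = []
    for _ in range(WIDTH):
        n, remainder = divmod(n, BASE)
        chars.append(ALPHABET[remainder])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(chars))


def decode_id(code: str) -> int:
    """Decode a 4-character base-36 string back to an integer (hypothetical helper)."""
    if len(code) != WIDTH or any(c not in ALPHABET for c in code):
        raise ValueError("expected exactly four characters from 0-9, A-Z")
    value = 0
    for c in code:
        value = value * BASE + ALPHABET.index(c)
    return value


if __name__ == "__main__":
    print(MAX_VALUE)             # largest representable value
    print(encode_id(MAX_VALUE))  # its 4-character code
    print(decode_id("0010"))     # a small round-trip example
```

Under these assumptions, four base-36 digits give $36^4 = 1{,}679{,}616$ distinct codes, and $1{,}679{,}615$ is the largest value (or the number of usable codes if the all-zero code is excluded), which leaves ample room beyond the roughly 14,000 entries curated so far.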
Abstract
Natural Medicinal Materials (NMMs) have a long history of global clinical applications and a wealth of records and knowledge. Although NMMs are a major source for drug discovery and clinical application, the utilization and sharing of NMM knowledge face crucial challenges, including the standardized description of critical information, efficient curation and acquisition, and language barriers. To address these, we developed ShennongAlpha, an AI-driven sharing and collaboration platform for intelligent knowledge curation, acquisition, and translation. For standardized knowledge curation, the platform introduced a Systematic Nomenclature to enable accurate differentiation and identification of NMMs. More than fourteen thousand Chinese NMMs have been curated into the platform along with their knowledge. Furthermore, the platform pioneered chat-based knowledge acquisition, standardized machine translation, and collaborative knowledge updating. Together, our study represents the first major advance in leveraging AI to empower NMM knowledge sharing; it not only marks a novel application of AI for Science but will also significantly benefit the global biomedical, pharmaceutical, physician, and patient communities.
