Automatic Standardization of Chinese Clinical Terms to SNOMED CT Based on Large Language Model Technology

Digital Health China

Country / Region: APAC
Tags: Global/International, Innovation, Mapping, Translation

Description

This study presents a method for the automatic standardization of Chinese clinical terms to SNOMED CT using large language model (LLM) technology. To address the challenges of mapping Chinese medical terminology to standardized international vocabularies, the research constructs a dataset covering diverse clinical terms and fine-tunes a pretrained large model (DeepSeek-V3) to improve the accuracy and efficiency of term mapping. Experimental results demonstrate that the LLM-based approach significantly outperforms traditional rule-based or dictionary-based methods. It effectively handles complex cases such as polysemy and term variations, while also supporting contextual understanding to ensure precise term alignment. This technology has the potential to enhance clinical data integration and interoperability, providing standardized terminology support for intelligent healthcare systems and medical research applications.

Scope

1. Data Level:

How SNOMED CT will be used

SNOMED CT is employed as the standardized medical terminology system to establish a unified and structured mapping between Chinese clinical terms and international medical knowledge frameworks. Key uses include:

1. Term Alignment Goal:

The system automatically maps unstructured Chinese clinical terms (such as disease names and symptom descriptions) to the corresponding standard terms or concept IDs in SNOMED CT.

2. Reference System for Matching:

SNOMED CT serves as the reference system for validating whether Chinese terms have been correctly standardized and for evaluating mapping outcomes.

3. Training and Evaluation Benchmark:

The outputs of the large language model are compared against SNOMED CT concepts to assess semantic accuracy and normalization effectiveness.

In summary, SNOMED CT is used as the target system for standardization, providing structured coding, unified semantic representation, and enabling interoperability of medical information.

Why SNOMED CT will be used

SNOMED CT was chosen as the target standard vocabulary for the following key reasons:

1. International Authority:

SNOMED CT is one of the most comprehensive and well-structured medical terminology systems globally, widely adopted in clinical information systems and electronic health records.

2. Comprehensive Coverage:

It includes a wide range of medical concepts (such as diseases, symptoms, procedures, and diagnostics), making it suitable for mapping common expressions in Chinese clinical text.

3. Semantic Interoperability:

Using SNOMED CT facilitates semantic interoperability between Chinese clinical data and global medical information systems, enhancing data sharing and integration.

4. Support for Standardization and Global Integration:

Aligning with SNOMED CT promotes the standardization of Chinese healthcare data and its integration into the international medical ecosystem.

Therefore, SNOMED CT was selected to achieve high-quality, internationally compatible terminology normalization and to improve the global applicability of Chinese clinical terms.
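
The term alignment and evaluation uses described above (mapping Chinese terms to SNOMED CT concept IDs, then comparing model outputs against reference concepts) can be illustrated with a minimal sketch. This is not the study's evaluation code: the function name `mapping_accuracy`, the exact-match scoring rule, and the sample term lists are all assumptions made for illustration, and the concept IDs shown should be verified against a licensed SNOMED CT release before any real use.

```python
def mapping_accuracy(predicted: dict[str, str], gold: dict[str, str]) -> float:
    """Fraction of reference terms whose predicted concept ID matches exactly."""
    if not gold:
        raise ValueError("gold reference set is empty")
    correct = sum(1 for term, cid in gold.items() if predicted.get(term) == cid)
    return correct / len(gold)


# Hypothetical reference mappings: Chinese clinical term -> SNOMED CT concept ID.
gold = {
    "心肌梗死": "22298006",  # myocardial infarction
    "心梗": "22298006",      # colloquial abbreviation, same concept (term variation)
    "高血压": "38341003",    # hypertensive disorder
    "糖尿病": "73211009",    # diabetes mellitus
}

# Hypothetical model outputs; the last term is mapped to an overly specific concept.
predicted = {
    "心肌梗死": "22298006",
    "心梗": "22298006",
    "高血压": "38341003",
    "糖尿病": "44054006",    # type 2 diabetes mellitus
}

accuracy = mapping_accuracy(predicted, gold)
errors = [term for term in gold if predicted.get(term) != gold[term]]

print(f"accuracy = {accuracy:.2f}")
print("mismatched terms:", errors)
```

Exact-match scoring is the simplest choice and treats a near miss (a child or parent concept) the same as an unrelated concept. A hierarchy-aware metric would need the SNOMED CT relationship files, which this sketch deliberately leaves out.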
