Data Quality Management of Standard Terminology Master Table for Multi-Institutional Research: Focusing on Validity, Completeness, and Timeliness

Kakao Healthcare

Country / Region
APAC
Tags
Data quality, Mapping

This project outlines the processes for managing and maintaining a standard terminology master table for multi-institutional clinical terminology standardization. As the master table grows and takes in more vocabularies for multi-institutional research, managing its data quality becomes crucial. Kakao Healthcare developed an extended model called UDM, based on the OMOP CDM, to support broader clinical coverage. The standard terminologies used for mapping clinical terms are managed using the concept ID and concept code from the OMOP CDM vocabulary table. The concept ID is a unique identifier for each concept across all vocabularies within the OMOP CDM vocabulary, while the concept code is the actual standard code that identifies the concept within its own vocabulary. For newly created SNOMED CT concepts and for standard terminologies not included in the OMOP CDM vocabulary managed by Kakao Healthcare, separate concept IDs (e.g., "udm_12345") are generated and maintained internally.
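The two kinds of concept ID described above can be told apart mechanically. The following is a minimal sketch, assuming that OMOP-issued concept IDs are stored as plain digit strings and that internal IDs always follow the "udm_<digits>" shape of the example; the function name and the returned labels are hypothetical.

```python
import re

# Assumption: internal IDs look like the "udm_12345" example in the text.
INTERNAL_ID = re.compile(r"udm_[0-9]+")


def concept_id_source(concept_id: str) -> str:
    """Classify a concept ID as OMOP-issued, internally issued, or unknown."""
    if concept_id.isascii() and concept_id.isdigit():
        return "omop"
    if INTERNAL_ID.fullmatch(concept_id):
        return "internal"
    return "unknown"
```

A check like this could flag identifiers that match neither shape before they enter the master table.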

To ensure the efficient maintenance of the standard terminology master table, this document outlines the management processes in response to standard terminology updates. It describes how timeliness, data validity, and completeness are addressed in relation to updates across various vocabularies.

Through these processes, we aim to minimize manual errors and keep the terminology up to date by regularly incorporating updates to standard vocabularies. Additionally, by effectively managing SNOMED CT concepts, which account for the highest proportion of mappings in Kakao Healthcare's standardized terminology, we expect to improve the consistency and accuracy of clinical terminology mapping across multiple institutions.

Description

This study describes the data quality management process for the standard terminology master table in a multicenter research setting, with a focus on three key aspects: validity, completeness, and timeliness.

1. Timeliness

This is the procedure for maintaining data timeliness by reflecting version updates of SNOMED CT, the OMOP CDM vocabulary table, and terms outside the scope of the OMOP CDM vocabulary that are used in terminology mapping. For SNOMED CT, updates are applied after the full clinical terminology mapping for an institution is complete and the Kakao Healthcare (KHC) edition is released. For the OMOP CDM vocabulary, updates are performed once a year. To prevent version conflicts during updates, each vocabulary is updated separately.

1.1. Apply updates from the release of the SNOMED CT Kakao Healthcare edition

1.2. Reflect changes resulting from updates to the OMOP CDM version

2. Validity
This is the procedure to ensure the validity of standard vocabularies used in mapping.

2.1. Ensure that valid vocabularies are used for clinical term mappings.

2.2. Validate data types and values in each column of the standard terminology master table stored in the database.

3. Completeness

This is the procedure for verifying whether all clinical terms essential for interoperability in multi-institutional research have been mapped without omissions.

3.1. After completing terminology mapping for an institution, verify whether any terms were omitted.

3.2. For unmapped terms, check whether they are mapped to the appropriate designated codes based on their eligibility for standardization. Kakao Healthcare assigns these terms HL7 codes in four categories (invalid, no information, not available, and temporarily unavailable), depending on the mapping target and mapping availability.

Scope

The reasons for using SNOMED CT as a core component in the terminology master are as follows:
1. Systematic Version Control and Inactive Concept Management:
SNOMED CT is a standard terminology system that is systematically maintained across versions. Each release includes detailed records of changes, including information on concepts that have become inactive. By reflecting these updates, it is possible to ensure the accuracy and currency of the data. Identifying inactive concepts and mapping them to appropriate replacements helps improve consistency and reduce errors in the data.

2. Utilization of Historical Association Reference Sets:
SNOMED CT provides the Association Reference Set as well as the Historical Association Reference Sets, which help identify concepts whose meanings are identical or similar to those of inactivated concepts. This allows the semantic integrity of clinical terminology to be maintained. Rather than manually remapping inactive concepts, this approach preserves their meaning by linking them to suitable alternatives, preventing information loss and maintaining the semantic accuracy of the mapped standard terms.

SNOMED CT offers robust version control and detailed information about inactive concepts. Through the Association Reference Set and the Historical Association Reference Sets, it supports semantic continuity. These strengths make SNOMED CT a key component in Kakao Healthcare's terminology standardization process, enabling efficient data standardization and high-quality data management.

How SNOMED CT will be used

Given the high proportion of SNOMED CT usage in Kakao Healthcare's standardized terminology (approximately 30% of all mappings: 291,466 of 986,903 local codes), SNOMED CT must be considered a core component during terminology updates. Regular maintenance of this terminology is essential to support multi-institutional research and ensure data interoperability. By actively using SNOMED CT concepts during the mapping process, version control and historical tracking can be managed efficiently, ultimately improving consistency across institutions.

The following summarizes both the procedures implemented to manage SNOMED CT and key outcomes observed during the update process:
1. Timeliness
1.1 SNOMED CT Updates
When the SNOMED CT KHC (Kakao Healthcare) edition is updated to align with the international release:
* Among 3,655 pre-coordinated concepts that became inactive, 2,491 (68%) were updated using the Historical Association Reference Sets. We referred to 900000000000522004 |Historical association| from the Historical Association Reference Sets and applied the "same as" and "replaced by" relationships collectively, without additional validation by the mapper. We also referred to 900000000000481005 |Concept inactivation value (foundation metadata concept)| to understand the reasons for inactivation.
* 778 out of 56,469 SNOMED CT concepts used in the terminology were updated accordingly.
* About 800 internally created (post-coordinated) concepts, modeled using SNOMED CT concepts that later became inactive, were flagged for revision and institutional-level updates.
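The bulk replacement of inactive concepts through historical associations can be sketched as follows. This is an illustration under stated assumptions, not Kakao Healthcare's implementation: the `Association` record, the `"SAME_AS"` and `"REPLACED_BY"` labels, and the function name are hypothetical stand-ins (in an RF2 release each association type is identified by its own reference set ID), and sending concepts with zero or several candidate targets to manual review is a design choice of this sketch rather than something the text describes.

```python
from typing import NamedTuple


class Association(NamedTuple):
    source_id: str  # inactive concept
    kind: str       # association type label (illustrative)
    target_id: str  # suggested active concept


# Association types applied in bulk, per the text: "same as" and "replaced by".
AUTO_APPLY = {"SAME_AS", "REPLACED_BY"}


def resolve_inactive(mapped_ids, inactive_ids, associations):
    """Return (replacements, needs_review) for mapped concepts that became inactive."""
    targets = {}
    for assoc in associations:
        if assoc.kind in AUTO_APPLY:
            targets.setdefault(assoc.source_id, set()).add(assoc.target_id)

    replacements, needs_review = {}, []
    for concept_id in mapped_ids:
        if concept_id not in inactive_ids:
            continue  # still active: nothing to do
        candidates = targets.get(concept_id, set())
        if len(candidates) == 1:
            replacements[concept_id] = next(iter(candidates))
        else:
            needs_review.append(concept_id)  # no target, or an ambiguous one
    return replacements, needs_review
```

Concepts returned in `needs_review` correspond to the inactive concepts that the historical associations alone could not update.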

1.2 CDM Table Updates
When the OMOP CDM vocabulary version is updated (a total of 83,256 codes used for terminology mapping):
* 1,486 concepts were replaced with new standard codes and mapped accordingly using updated concept IDs.
* In 859 cases, where codes became non-standard, new internal concept IDs were assigned for continued use and consistency in mapping.
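The two outcomes above can be expressed as a small planning step. This is a sketch under assumptions: `standard_ids` stands for the concepts still flagged as standard in the new vocabulary version, `maps_to` stands for replacements taken from something like the OMOP `concept_relationship` table ("Maps to" relationships), and issuing a sequential "udm_" ID when no replacement exists is an assumed reading of the second bullet. All names are hypothetical.

```python
def plan_omop_update(used_ids, standard_ids, maps_to, next_internal_seq):
    """Decide the post-update concept ID for each concept ID used in mappings.

    used_ids: concept IDs currently used in the master table
    standard_ids: IDs that are still standard after the update
    maps_to: non-standard ID -> new standard ID, where a replacement exists
    next_internal_seq: first free number for internal "udm_" IDs (assumed scheme)
    """
    plan = {}
    for concept_id in used_ids:
        if concept_id in standard_ids:
            plan[concept_id] = concept_id  # unchanged
        elif concept_id in maps_to:
            plan[concept_id] = maps_to[concept_id]  # remap to new standard concept
        else:
            plan[concept_id] = f"udm_{next_internal_seq}"  # keep under internal ID
            next_internal_seq += 1
    return plan
```

Producing the plan separately from applying it lets the remapped and internally reassigned concepts be counted and reviewed before the master table is changed.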

2. Validity
To ensure data integrity throughout the terminology management process:
* Only valid vocabularies are allowed for mapping.
* Column-level validation of the terminology management table is performed regularly (e.g., SNOMED CT IDs must be numeric, 6 to 18 digits long, with no special characters).
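The example rule for SNOMED CT IDs translates directly into a format check. The sketch below assumes the IDs are stored as strings and that rows are available as dictionaries; the column name `concept_code` and both function names are hypothetical. It checks only the format stated above, not whether the ID exists in a given release.

```python
import re

# Numeric only, 6 to 18 digits, no special characters (rule from the text).
SCTID_PATTERN = re.compile(r"[0-9]{6,18}")


def is_valid_sctid_format(value) -> bool:
    """True if value is a string of 6 to 18 ASCII digits."""
    return isinstance(value, str) and SCTID_PATTERN.fullmatch(value) is not None


def invalid_rows(rows, column="concept_code"):
    """Return the indexes of rows whose SNOMED CT ID fails the format check."""
    return [i for i, row in enumerate(rows)
            if not is_valid_sctid_format(row.get(column))]
```

Using `fullmatch` rather than `match` ensures that trailing characters, such as whitespace or punctuation, cause the value to be rejected.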

3. Completeness
We ensure full coverage of clinical terms and maintain terminology completeness through the following:
* After completing mapping for each institution, we verify that no local codes were missed, especially between mapping finalization and service rollout.
* We continuously monitor the proportion of "not mapped" codes among all mapped concepts.
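Both completeness checks reduce to simple set and ratio calculations. The following is a minimal sketch, assuming that each institution's mappings are available as a dictionary from local code to target and that "not mapped" targets are recorded with the four category labels named earlier; the data layout and function name are hypothetical.

```python
# Category labels taken from the text; how they are stored is an assumption.
NOT_MAPPED_CATEGORIES = {
    "invalid",
    "no information",
    "not available",
    "temporarily unavailable",
}


def completeness_report(local_codes, mappings):
    """Report omitted local codes and the share of "not mapped" targets.

    local_codes: every local code the institution uses
    mappings: local code -> target (a concept ID or a not-mapped category)
    """
    missing = sorted(set(local_codes) - set(mappings))
    not_mapped = [code for code, target in mappings.items()
                  if target in NOT_MAPPED_CATEGORIES]
    ratio = len(not_mapped) / len(mappings) if mappings else 0.0
    return {"missing": missing, "not_mapped_ratio": ratio}
```

Running the report again just before service rollout would surface local codes added after mapping was finalized.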
