Country / Region
EMEA
Tags
Clinical Practice, Global/International, Mapping, Translation
The Graz Interface Terminology for SNOMED CT (SNO_GIT) is a very large vocabulary that links German-language clinical jargon to SNOMED CT codes, using n-gram extraction, machine translation, linguistic rules, and manual curation. We report on the evolution of this resource, first presented at EXPO 2015, over the past ten years and describe its use in a clinical coding scenario.
Description
The Graz Interface Terminology for SNOMED CT (SNO_GIT) is an ongoing activity that aims to develop a German-language user interface terminology aligned with the international version of SNOMED CT. SNO_GIT is designed to serve as a resource-efficient, scalable, and pragmatic foundation for integrating SNOMED CT into German-language clinical information systems. It supports both entity linkage for clinical text mining and the provision of a comprehensive index for term retrieval in manual coding scenarios.
The creation of SNO_GIT started in 2013 at the Medical University of Graz (Austria), with the goal of covering all English-language SNOMED CT terms linked to active concepts. These source terms from the English Descriptions table are first chunked into noun phrases and then broken down into token n-grams, i.e. sequences of one to five words. These units are systematically translated into German, initially through Web translation services and subsequently refined through manual curation. This core vocabulary is then enriched with tags that specify parts of speech and noun inflection classes. Special attention is given to the linguistic particularities of the German language, including single-word compound formation, noun inflections, and word order. Increasingly, synonyms from authentic clinical corpora have been added, many of which are short forms such as acronyms.
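The n-gram step described above can be illustrated with a minimal Python sketch. It is not the project's actual code: the function name `token_ngrams` and the whitespace tokenisation are assumptions made for illustration, and the preceding noun-phrase chunking is taken as already done.

```python
def token_ngrams(phrase: str, max_n: int = 5) -> list[str]:
    """Return all token n-grams (1 to max_n words) of a noun phrase.

    Hypothetical helper: whitespace tokenisation is a simplifying assumption.
    """
    tokens = phrase.split()
    return [
        " ".join(tokens[start:start + n])
        for n in range(1, max_n + 1)             # n-gram length
        for start in range(len(tokens) - n + 1)  # sliding window position
    ]


ngrams = token_ngrams("acute myocardial infarction")
```

For a three-word phrase this yields three unigrams, two bigrams, and one trigram; each unit would then be passed on to translation and curation.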
The development process is supported by numerous Python scripts for variant generation and editorial assistance. Manual review, mostly done by medical students, produces a growing list of negative patterns to filter out atypical or unsuitable term candidates. The remaining candidates are ranked based on the frequency of their constituent token sequences in German reference corpora.
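As a rough sketch of this filter-and-rank step, the following Python fragment drops candidates that match a negative pattern and orders the rest by corpus frequency. Everything concrete here is an assumption: the function name `rank_candidates`, the use of regular expressions for negative patterns, and the scoring formula (mean log frequency of the individual tokens) are illustrative choices, not the method used in SNO_GIT.

```python
import math
import re


def rank_candidates(candidates, negative_patterns, corpus_counts):
    """Filter term candidates by negative patterns, then rank by frequency.

    candidates: list of German term candidates (strings)
    negative_patterns: regular expressions marking unsuitable candidates
    corpus_counts: dict mapping lower-cased tokens to corpus frequencies
    """
    compiled = [re.compile(p) for p in negative_patterns]
    kept = [
        c for c in candidates
        if c.split() and not any(p.search(c) for p in compiled)
    ]

    def score(term):
        # Assumed scoring: mean log frequency of the constituent tokens.
        tokens = term.lower().split()
        return sum(math.log1p(corpus_counts.get(t, 0)) for t in tokens) / len(tokens)

    return sorted(kept, key=score, reverse=True)


# Illustrative data only; the counts are invented for the example.
counts = {"herzinfarkt": 100, "myokardinfarkt": 10}
ranked = rank_candidates(
    ["Myokardinfarkt", "Herzinfarkt", "Herzinfarkt Herzinfarkt"],
    [r"\b(\w+) \1\b"],  # hypothetical negative pattern: immediate word repetition
    counts,
)
```

A real pipeline would more plausibly score longer token sequences rather than single tokens, as the text refers to constituent token sequences.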
The resulting large-scale interface terminology is further customised for specific use cases such as manual coding support and clinical text mining. Cut-off points are set for the maximum number of synonyms per SNOMED CT code and for a minimum score value. Depending on the application, the number of terms ranges from two to ten million.
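The two cut-offs can be sketched as follows. This is an assumed illustration, not the project's implementation: the function name `apply_cutoffs`, the `(code, term, score)` tuple layout, and the placeholder codes are all hypothetical.

```python
from collections import defaultdict


def apply_cutoffs(scored_terms, max_synonyms, min_score):
    """Keep at most max_synonyms terms per code, each scoring >= min_score.

    scored_terms: iterable of (code, term, score) tuples (assumed layout)
    """
    by_code = defaultdict(list)
    for code, term, score in scored_terms:
        if score >= min_score:
            by_code[code].append((score, term))
    return {
        code: [term for _, term in sorted(items, key=lambda x: -x[0])[:max_synonyms]]
        for code, items in by_code.items()
    }


# Placeholder codes and invented scores, for illustration only.
terms = [
    ("C1", "Herzinfarkt", 0.9),
    ("C1", "Myokardinfarkt", 0.7),
    ("C1", "MI", 0.4),
    ("C1", "Infarkt des Herzens", 0.1),
    ("C2", "Hypertonie", 0.8),
]
subset = apply_cutoffs(terms, max_synonyms=2, min_score=0.3)
```

Tightening or loosening the two parameters is one way a single master vocabulary could yield differently sized subsets for manual coding support and for text mining.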
Scope
SNOMED CT was chosen for its conceptual depth, international distribution, and machine-readable structure. The described use case supports the precise and interoperable representation of outpatient records, a first step towards the implementation of a fully structured International Patient Summary (IPS).
Thus, the presented project constitutes by far the largest SNOMED CT implementation in a German-speaking country.
How SNOMED CT will be used
ELGA GmbH, the Austrian SNOMED CT National Release Center, has recently established a collaboration with the developers of SNO_GIT with the goal of providing this resource to speed up the coding of diagnoses in the outpatient sector, with the prospect of expansion to other sectors of the healthcare system and to other types of relevant information such as clinical procedures.
SNO_GIT will thus be used as a very large index vocabulary for the e-Health coding service developed and operated by the Austrian Ministry of Labour, Social Affairs, and Consumer Protection and rolled out for routine use in outpatient clinics in 2026.
Constant adaptation to an ever-evolving clinical terminology will require long-term collaboration and the increasing use of AI technology for content maintenance and quality assurance.
Why SNOMED CT will be used