NLP>OMOP: Transforming Clinical Narratives into Actionable Data

ibis.ai

Country / Region
EMEA
Tags
Artificial intelligence, Collaboration, Data quality

To tackle the challenge of transforming free text into an OMOP-compatible format, we developed dedicated SNOMED-aware clinical language models using open-source Large Language Models (LLMs). Data privacy is safeguarded through on-premise training and deployment.

In addition, a SNOMED Query Builder (SnoQB) was built to help researchers curate and manage SNOMED CT concept sets for querying patient data. These queries are matched against a vector database of patient data built with the clinical language models. Extraction results are stored in the note_nlp table of the OMOP model, which makes them available for flexible downstream analytics. Harmonising these OMOP models across the participating hospitals facilitates data exchange and positions the hospitals to participate in real-world evidence (RWE) studies.
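The flow described above (match a concept-set query against embedded patient text, then store each hit in note_nlp) can be sketched as follows. This is a minimal illustration, not the project's implementation: the three-dimensional embeddings, the similarity threshold, the `nlp_system` label and the SNOMED-to-OMOP mapping are all assumptions made up for the example, and only a subset of the note_nlp columns is created.

```python
import math
import sqlite3
from datetime import date

# Toy embeddings; a real system would obtain them from a clinical language model.
SNIPPETS = [
    # (note_id, snippet, embedding)
    (101, "known osteoporosis, on bisphosphonates", [0.9, 0.1, 0.0]),
    (102, "no history of fractures", [0.1, 0.9, 0.1]),
]

# Hypothetical concept-set query: SNOMED CT code -> (label, query embedding).
QUERY = {"64859006": ("osteoporosis", [1.0, 0.0, 0.0])}

# Assumed OMOP concept_id for the SNOMED code above; verify against your
# own OMOP vocabulary tables before relying on it.
OMOP_CONCEPT_ID = {"64859006": 80502}


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def run_pipeline(conn, threshold=0.8):
    """Match every query concept against every snippet and store the hits."""
    conn.execute(
        """CREATE TABLE note_nlp (
               note_nlp_id INTEGER PRIMARY KEY,
               note_id INTEGER NOT NULL,
               snippet TEXT,
               lexical_variant TEXT NOT NULL,
               note_nlp_concept_id INTEGER,
               nlp_system TEXT,
               nlp_date TEXT NOT NULL)"""
    )
    for snomed_code, (label, query_vec) in QUERY.items():
        for note_id, snippet, vec in SNIPPETS:
            if cosine(query_vec, vec) >= threshold:
                conn.execute(
                    "INSERT INTO note_nlp (note_id, snippet, lexical_variant, "
                    "note_nlp_concept_id, nlp_system, nlp_date) "
                    "VALUES (?, ?, ?, ?, ?, ?)",
                    (note_id, snippet, label, OMOP_CONCEPT_ID[snomed_code],
                     "toy-matcher", date.today().isoformat()),
                )
    return conn.execute(
        "SELECT note_id, note_nlp_concept_id FROM note_nlp ORDER BY note_nlp_id"
    ).fetchall()


if __name__ == "__main__":
    print(run_pipeline(sqlite3.connect(":memory:")))
```

With the toy data, only the first snippet clears the threshold, so a single row is written. A production matcher would also need to handle negation and temporality (the note_nlp columns term_exists and term_temporal exist for this), which the sketch leaves out.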

The entire pipeline was developed and implemented during a proof-of-concept study across a consortium of Belgian hospitals (AZ Klina, AZorg, AZ Oostende and AZ Delta), led by AZ Klina. In a follow-up validation project (PROZA), the tools developed in this project were validated with a real-world use case focused on the automated screening of osteoporosis indicators in patients' medical history, showcasing the successful integration of structured and unstructured data from diverse healthcare sources. This project was funded by the Belgian federal authorities' Data Capabilities initiative.

Description

The scope of this project centers on integrating Natural Language Processing (NLP) with the OMOP Common Data Model (CDM) to convert unstructured clinical narratives into structured, actionable data through the use of SNOMED CT.

Scope

SNOMED CT was chosen for the NLP>OMOP project due to its comprehensive nature and its status as a global standard for clinical terminology. It effectively handles the wide array of concepts found in unstructured clinical narratives, enabling precise mapping and consistent representation of data. This supports interoperability between healthcare systems, facilitating data sharing across different institutions. SNOMED CT also allows for complex queries, aiding in detailed research and analysis through tools like the SNOMED Query Builder (SnoQB).

How SNOMED CT will be used

In the NLP>OMOP project, SNOMED CT plays a central role in standardizing and structuring clinical data extracted from unstructured text, such as patient reports. The project develops clinical language models specifically tailored to recognize clinical narratives and map them to SNOMED CT concepts. As a result, free-text data is transformed into structured, consistent representations that align with a globally recognized terminology system, which facilitates interoperability and data exchange.

Furthermore, the project includes the development of the SNOMED Query Builder (SnoQB), which allows researchers to define and manage SNOMED CT concept sets relevant to their specific research interests. These SNOMED CT concept sets enable precise querying of patient data, effectively translating complex clinical language into a standardized format that can be systematically analyzed and stored in OMOP.
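To make the idea of a curated concept set concrete, the sketch below expands a small set over a toy is-a hierarchy. It is an assumption-laden illustration and does not describe how SnoQB works internally: the concept names are invented placeholders rather than SNOMED CT identifiers, and the `include_descendants` and `exclude` flags are hypothetical options borrowed from common concept-set conventions.

```python
from dataclasses import dataclass

# Toy is-a hierarchy (parent -> direct children); names are placeholders,
# not real SNOMED CT identifiers.
CHILDREN = {
    "bone_disorder": ["osteoporosis", "osteopenia"],
    "osteoporosis": ["postmenopausal_osteoporosis", "drug_induced_osteoporosis"],
}


@dataclass(frozen=True)
class ConceptSetItem:
    concept: str
    include_descendants: bool = True
    exclude: bool = False


def descendants(concept):
    """Return all transitive children of a concept in the toy hierarchy."""
    found = set()
    stack = list(CHILDREN.get(concept, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(CHILDREN.get(current, []))
    return found


def expand(items):
    """Resolve a concept set to the concrete concepts it selects."""
    included, excluded = set(), set()
    for item in items:
        members = {item.concept}
        if item.include_descendants:
            members |= descendants(item.concept)
        (excluded if item.exclude else included).update(members)
    return included - excluded


if __name__ == "__main__":
    concept_set = [
        ConceptSetItem("osteoporosis"),
        ConceptSetItem("drug_induced_osteoporosis", exclude=True),
    ]
    print(sorted(expand(concept_set)))
```

Keeping exclusions separate from inclusions means an excluded concept is removed regardless of the order in which items were added to the set.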
