Extraction and Transformation of Pathology Cancer Report Data into a SNOMED CT Encoded Data Repository

University of Nebraska Medical Center

Country / Region
Americas
Tags
Data quality, Implementation, Mapping, Research

University of Nebraska Medical Center pathology cancer reports have been recorded and stored as Portable Document Format (PDF) files in the Electronic Health Record (EHR). As a result, the data held within these reports are not readily available for research or patient care decision making. To address this, we developed an approach to extract large amounts of historical cancer pathology data, encode them with SNOMED CT, and insert them into a Clinical Data Warehouse (CDW) to support clinical research and population health efforts.

A five-step process was applied to five cancer types: ampulla of Vater, pancreas, esophagus, lung, and prostate. (1) PDF pathology reports were generated from a pathology information system; (2) the system's user interface for synoptic reporting was evaluated, all possible text strings were identified, and bindings between the controlled text strings and SNOMED CT were created; (3) a Python-based program was used to convert the PDF reports to plain text and extract the synoptic text strings; (4) free-text data were bound to SNOMED CT; (5) the SNOMED CT encoded cancer data were integrated into the CDW.
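
Step 3 can be illustrated with a minimal sketch. It assumes the PDF has already been converted to plain text and that each synoptic element appears on its own line as "Label: value"; the real report layout, the PDF conversion tool, and the project's actual program are not described here, so the pattern, the function name `extract_synoptic_pairs`, and the sample text are all hypothetical.

```python
import re

# Assumed layout: one "Label: value" pair per line. Real synoptic
# reports may wrap lines or nest elements, which this sketch ignores.
SYNOPTIC_LINE = re.compile(r"^\s*(?P<label>[^:\n]+?)\s*:\s*(?P<value>.+?)\s*$")


def extract_synoptic_pairs(report_text):
    """Return (label, value) tuples for each line shaped like 'Label: value'."""
    pairs = []
    for line in report_text.splitlines():
        match = SYNOPTIC_LINE.match(line)
        if match:
            pairs.append((match.group("label"), match.group("value")))
    return pairs


# Made-up sample text for illustration only, not taken from a real report.
sample_report = """\
SYNOPTIC REPORT
Procedure: Radical prostatectomy
Histologic Type: Acinar adenocarcinoma
Margins: Uninvolved by invasive carcinoma
Comment: see note
"""

pairs = extract_synoptic_pairs(sample_report)
for label, value in pairs:
    print(f"{label!r} -> {value!r}")
```

Lines without a colon, such as the report title, are skipped. Elements like the comment line are still captured and would be candidates for the manual free-text binding in step 4.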

This project demonstrated a tractable method to extract PDF-based pathology cancer data from the EHR and transform them into SNOMED CT encoded data in support of research and public health. Automated data conversion from text/PDF to SNOMED CT is possible when structured text is used to create the initial reports. Free-text or non-standard, unstructured text was reviewed manually, case by case. Using this approach, multiple years of historical pathology cancer data were encoded and made available for multiple future uses.

Description

It is still common for pathology reports to be stored and distributed as Portable Document Format (PDF) files. While this format makes the information easy for users to read, it creates a barrier to computation, secondary use, and electronic data exchange. This is an important issue because pathology reports contain crucial information for patient care and treatment planning.

To address this, we initiated an ongoing project composed of three parts: (1) data extraction, (2) data encoding using SNOMED CT, and (3) integration into the Clinical Data Warehouse (CDW). This abstract presents findings from the process as applied to five cancer types: ampulla of Vater, pancreas, esophagus, lung, and prostate.

Scope

SNOMED CT was selected because it is currently the sole international medical terminology standard capable of representing pathology cancer data as found in pathology reports. In addition, SNOMED CT is a terminology supported for use in national and international research-oriented Clinical Data Warehouses (CDWs), including OMOP and PCORI. This study demonstrates a practical, tractable process, intended for widespread use, for extracting uncoded pathology data from the electronic health record and converting them into encoded, structured data for use in a CDW.

How SNOMED CT will be used

In the encoding process, SNOMED CT was used to bind (map) the data extracted from the Portable Document Format (PDF) reports. Specifically, terminology binding between the extracted data and SNOMED CT was conducted in two parts: (1) batch binding using a predefined master file derived from the synoptic report format, and (2) manual binding for free-text elements. The project follows the pathology cancer data methods proposed by the Cancer Synoptic Reporting Working Group (CSRWG) of SNOMED International and described in the Cancer Synoptic Reporting Implementation Guide.
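
The two-part binding can be sketched as a dictionary lookup with a fallback queue. This is an illustrative sketch only: the structure of the project's master file is not given, so the `MASTER_BINDINGS` table, the `normalize` and `bind_pairs` helpers, and the sample strings are assumptions, and the identifiers are placeholders rather than real SNOMED CT concept IDs.

```python
# Hypothetical master file: normalized (label, value) -> concept identifier.
# The identifiers are placeholders, NOT real SNOMED CT concept IDs.
MASTER_BINDINGS = {
    ("histologic type", "acinar adenocarcinoma"): "PLACEHOLDER-001",
    ("margins", "uninvolved by invasive carcinoma"): "PLACEHOLDER-002",
}


def normalize(text):
    """Lowercase and collapse whitespace so trivial variants still match."""
    return " ".join(text.lower().split())


def bind_pairs(pairs, master=MASTER_BINDINGS):
    """Split extracted pairs into batch-bound rows and a manual-review queue."""
    bound, needs_review = [], []
    for label, value in pairs:
        concept_id = master.get((normalize(label), normalize(value)))
        if concept_id is None:
            # No controlled string matched: route to manual binding.
            needs_review.append((label, value))
        else:
            bound.append((label, value, concept_id))
    return bound, needs_review


# Made-up extracted strings for illustration only.
example_pairs = [
    ("Histologic Type", "Acinar  adenocarcinoma"),
    ("Margins", "Uninvolved by invasive carcinoma"),
    ("Comment", "Focal perineural invasion noted near apex"),
]

bound, needs_review = bind_pairs(example_pairs)
print("bound:", bound)
print("needs manual review:", needs_review)
```

Exact matching after light normalization keeps the batch step predictable: anything outside the controlled strings falls through to manual review instead of being guessed.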

Why SNOMED CT will be used
