Optical Character Recognition Combined With Natural Language Processing to Improve Extraction of Quality Parameters From Colonoscopy Reports


Douglas K. Rex, MD, MASGE, reviewing Laique SN, et al. Gastrointest Endosc 2020 Sep 3.

Measurement and improvement of colonoscopy quality measures, such as the adenoma detection rate (ADR), are critical to optimizing colorectal cancer prevention by colonoscopy. Measuring ADR still frequently requires manual data extraction, which is time-consuming and expensive. One method of automating quality measurement is natural language processing (NLP), which has already proven effective but requires machine-readable clinical text and does not work with printed or scanned documents. Optical character recognition (OCR) converts scanned paper documents into editable and searchable text data. In addition to accurately extracting data on polyp and adenoma detection, bowel preparation quality, and success of cecal intubation, the hybrid OCR/NLP technology had 100% accuracy in evaluating quality endpoints (compared with manual annotation) in 30 colonoscopy procedures performed at a different institution using different endoscopy report-writing software.

Douglas K. Rex, MD, FASGE

COMMENT

In the discussion section of the paper, the authors indicate that the programmer who developed the current program invested about 150 man-hours, and the algorithm can now extract data in under 30 minutes for all colonoscopy procedures ever performed at the institution since the introduction of electronic medical records. By contrast, the two authors who manually extracted data took 6 to 8 minutes per patient, equating to 160 man-hours to annotate data from fewer than 600 patients.

The novelty here is the use of OCR technology, which would allow health care organizations scanning procedure reports into electronic health records to use NLP for quality measurements. In tertiary centers, the use of OCR on prior procedures from referring physicians could facilitate clinical care and research.

Note to readers: At the time we reviewed this paper, its publisher noted that it was not in final form and that subsequent changes might be made.

CITATION(S)

Laique SN, Hayat U, Sarvepalli S, et al. Application of optical character recognition with natural language processing for large-scale quality metric data extraction in colonoscopy reports. Gastrointest Endosc 2020 Sep 3. (Epub ahead of print) (https://doi.org/10.1016/j.gie.2020.08.038)

 
