Artificial Intelligence During Barrett’s Endoscopy Is Accurate for Reporting Prague Criteria
Prateek Sharma, MD, FASGE, reviewing Ali S, et al. Gastroenterology 2021 Sep.
During endoscopy, endoscopists quantify the extent of Barrett’s esophagus (BE) using the widely accepted Prague C&M (circumferential and maximal lengths) criteria. Prague C&M scores can be used to assess the risk of progression, but they are subjective and vary with the endoscopist’s level of experience.
This single-center pilot study evaluated a novel artificial intelligence (AI)-assisted, automatic, 3-dimensional (3D) system for measuring BE depth and length using the Prague C&M classification system. Initially, an advanced deep learning model was developed by mapping 2-dimensional real-world endoscopy images, which were compared to a 3D phantom esophagus derived from CT and endoscopic images. Measurements of C and M lengths taken during endoscopy were then compared with the ground-truth measurements from the phantom esophagus. The AI system was trained on 10,000 simulated images and validated on 2000 images (20%) over approximately 1000 epochs (100 hours).
The automatic quantification of the C and M extent had an accuracy of 97.2% (2.8% relative error) and an average deviation of 0.9 mm from the ground-truth measurements. The root mean square error was 1.2 mm, and agreement with the ground truth was substantial (Cohen’s kappa [k]=0.72 and Spearman correlation [rs]=0.99). C and M measurements made by endoscopists during real-time endoscopy in 131 patients were then compared with those from the automated measurement system.
This endoscopy patient cohort comprised 3 subgroups: newly diagnosed BE patients (n=68), BE patients undergoing follow-up without endoscopic therapy (n=24), and patients whose visits before and after endoscopic therapy were compared (n=39). For all patients, the overall relative errors (mean difference) were 8% (3.6 mm) and 7% (2.8 mm) for C and M values, respectively. The highest relative errors were observed in patients with BE lengths of 0 to 1 cm. Near-perfect interrater reliability was found between the AI system and endoscopists (k=0.84 and rs=0.99 for C; k=0.87 and rs=0.99 for M).
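For readers less familiar with the error measures quoted above, the following is a minimal Python sketch of how mean absolute deviation, root mean square error, and mean relative error can be computed from paired length measurements. The function name `error_metrics`, the example values, and the exact formulas are illustrative assumptions; the study may have defined its metrics differently, and none of the numbers below come from the paper.

```python
import math


def error_metrics(measured_mm, reference_mm):
    """Summarize agreement between paired length measurements (in mm).

    Hypothetical helper for illustration only; not the study's code.
    """
    if len(measured_mm) != len(reference_mm) or not measured_mm:
        raise ValueError("need two non-empty sequences of equal length")

    diffs = [m - r for m, r in zip(measured_mm, reference_mm)]

    # Average size of the error, ignoring its direction.
    mean_abs_dev = sum(abs(d) for d in diffs) / len(diffs)

    # Root mean square error penalizes large individual errors more heavily.
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))

    # Relative error is undefined for a zero reference length, so skip those pairs.
    rel = [abs(d) / r for d, r in zip(diffs, reference_mm) if r > 0]
    rel_error_pct = 100 * sum(rel) / len(rel) if rel else float("nan")

    return {
        "mean_abs_dev_mm": mean_abs_dev,
        "rmse_mm": rmse,
        "rel_error_pct": rel_error_pct,
    }


# Made-up example values (mm): automated readings vs. reference readings.
summary = error_metrics([10, 32, 51], [10, 30, 50])

# The same 2 mm absolute error is a much larger relative error on a short segment.
short_segment = error_metrics([12], [10])["rel_error_pct"]   # 10 mm reference
long_segment = error_metrics([82], [80])["rel_error_pct"]    # 80 mm reference
```

Under these assumed definitions, a fixed absolute error of 2 mm corresponds to a 20% relative error on a 10 mm segment but only 2.5% on an 80 mm segment, which illustrates how relative error can be largest for the shortest BE lengths even when the absolute error is similar.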
Note to readers: At the time we reviewed this paper, its publisher noted that it was not in final form and that subsequent changes might be made.
Ali S, Bailey A, Ash S, et al. A pilot study on automatic three-dimensional quantification of Barrett’s esophagus for risk stratification and therapy monitoring. Gastroenterology 2021;161:865-878.e8. (https://doi.org/10.1053/j.gastro.2021.05.059)