
Do You See What I See? An ICG Pattern Recognition Comparison Between ChatGPT and Lymphatic Surgeons

Alexis M. Henderson, MPH; Chanel L. Reid, MD; Meeti Mehta, BS; Viraj N. Govani, BA; Shayan M. Sarrami, MD; Carolyn De La Cruz, MD
University of Pittsburgh School of Medicine
2025-01-10

Presenter: Alexis M. Henderson

Affidavit:
Vu T. Nguyen, MD

Director Name: Vu T. Nguyen, MD

Author Category: Medical Student
Presentation Category: Clinical
Abstract Category: Breast (Aesthetic and Recon.)

Purpose: Artificial intelligence (AI) is increasingly useful in medicine, helping practitioners recognize patterns in patients. Previous research has explored AI's role in staging indocyanine green (ICG) lymphography patterns, but none has used ChatGPT. This study evaluates ChatGPT's ability to differentiate between varying severities of dermal backflow on ICG lymphography, with the aim of offering an accessible diagnostic tool for lymphatic surgeons and therapists.
Methods: The dataset comprised 40 upper extremity ICG lymphographs, each diagnosed as diffuse, linear, splash, or stardust by a lymphatic surgeon. Three reviewers independently used ChatGPT to classify the ICG images. Cohen's kappa statistics were used to assess agreement between the ChatGPT responses and the professional diagnoses.
Results: Overall, ChatGPT was 93% accurate in recognizing dermal backflow. Inter-rater reliability between the reviewer responses indicated fair agreement among the three reviewers (κ=0.24, p<0.001 and κ=0.32, p<0.001). After combining the reviewers' ratings into a composite ChatGPT group, we compared the ChatGPT responses to those of the lymphatic surgeon. ChatGPT exhibited fair agreement with the reference staging of lymphatic patterns, with a Cohen's kappa of 0.4 (p<0.001). ChatGPT's accuracy in identifying lymphatic patterns was 80% for linear, 50% for splash and stardust, and 40% for diffuse.
Conclusion: ChatGPT demonstrated fair agreement with a lymphatic surgeon's staging of various dermal backflow patterns. It was most accurate in identifying linear flow patterns and also showed promise in identifying more severe forms of lymphatic disruption. With increased utility, this learning model may become a useful tool to aid in diagnostic imaging.
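For readers unfamiliar with the agreement statistic reported above, the following is a minimal sketch of how Cohen's kappa can be computed for two raters assigning categorical labels. It is not the authors' analysis code: the function name `cohens_kappa` and the six example labels are hypothetical and chosen only for illustration; they are not data from this study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labeling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed
    proportion of agreement and p_e is the agreement expected by
    chance from each rater's marginal label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of items")

    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of marginal frequencies, summed over labels.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if p_e == 1.0:
        # Both raters used one identical label for every item.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical labels for illustration only (NOT the study's data).
surgeon = ["linear", "linear", "splash", "stardust", "diffuse", "splash"]
model = ["linear", "linear", "splash", "splash", "diffuse", "stardust"]

print(round(cohens_kappa(surgeon, model), 3))
```

In this illustrative example the raters agree on 4 of 6 items, yet kappa is lower than the raw 67% agreement because part of that agreement is expected by chance. The same effect explains how a classifier can show moderate per-pattern accuracy alongside only fair kappa.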


OVSPS Conference