Document Type
Article
Publication Date
1-31-2026
Abstract
BACKGROUND: The management of head and neck cancer relies on multidisciplinary expertise; however, access to tumor boards remains variable. Large language models (LLMs) may support guideline-based decision-making, although performance in complex oncologic scenarios is not well defined.
METHODS: Fourteen synthetic cases based on real tumor board encounters were evaluated. Five blinded comparator arms produced recommendations: a human expert, Non-RAG-GPT-4, Non-RAG-GPT-5, RAG-GPT-4, and RAG-GPT-5. Eight head and neck oncologic surgeons scored each recommendation for appropriateness, clarity, specificity, and feasibility using 5-point Likert scales. Arms were compared using paired permutation tests, and inter-rater reliability was assessed.
RESULTS: LLM outputs showed close alignment with expert recommendations. RAG-based models achieved the highest mean scores across domains, with some statistically significant differences versus the expert comparator in appropriateness and clarity; however, absolute differences were modest. Inter-rater reliability was strong (ICC 0.73-0.87).
CONCLUSIONS: Advanced LLMs can generate guideline-concordant management recommendations in simulated head and neck cancer cases, supporting potential utility for decision support and education; prospective validation and expert oversight remain essential.
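The paired permutation testing named in the methods can be illustrated with a minimal sketch. The abstract does not state the test statistic or how ratings were aggregated, so the choices here are assumptions: an exact two-sided sign-flip test on the mean paired difference, applied to one score per case for each arm. The function name and the example scores are hypothetical and are not data from the study.

```python
from itertools import product


def paired_permutation_test(scores_a, scores_b):
    """Exact two-sided sign-flip permutation test on the mean paired difference.

    Assumption: each list holds one score per case, in the same case order.
    Returns (mean difference, p-value).
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    observed = abs(sum(diffs)) / n

    extreme = 0
    total = 0
    # Under the null hypothesis the sign of each paired difference is
    # exchangeable, so enumerate all 2**n sign assignments (16384 for n = 14).
    for signs in product((1, -1), repeat=n):
        stat = abs(sum(s * d for s, d in zip(signs, diffs))) / n
        if stat >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        total += 1
    return sum(diffs) / n, extreme / total


if __name__ == "__main__":
    # Hypothetical per-case mean appropriateness scores for 14 cases.
    rag_model = [4.6, 4.4, 4.8, 4.5, 4.3, 4.7, 4.6, 4.4, 4.5, 4.9, 4.2, 4.6, 4.7, 4.5]
    expert = [4.4, 4.5, 4.6, 4.3, 4.3, 4.5, 4.4, 4.5, 4.3, 4.6, 4.1, 4.4, 4.6, 4.4]
    mean_diff, p_value = paired_permutation_test(rag_model, expert)
    print(f"mean difference = {mean_diff:.3f}, p = {p_value:.4f}")
```

With 14 cases the full enumeration is cheap, so no random resampling is needed. A larger design would typically draw random sign flips instead.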
Recommended Citation
Hack, Sholem; Karni, Ron J.; Maniaci, Antonino; Fundakowski, Christopher E.; Castellani, Luca; Incandela, Fabiola; Accorona, Remo; Mayo-Yanez, Miguel; Violati, Martina; Giannini, Lorenzo; Mevio, Niccolo'; and Saibene, Alberto Maria, "Evaluation of Large Language Models as Decision Support Tools for Head and Neck Cancer Management: A Blinded Multidisciplinary Simulation Study" (2026). Department of Otolaryngology - Head and Neck Surgery Faculty Papers. Paper 107.
https://jdc.jefferson.edu/otofp/107
Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
PubMed ID
41621281
Language
English
Included in
Artificial Intelligence and Robotics Commons, Biomedical Informatics Commons, Neoplasms Commons, Otolaryngology Commons

Comments
This article is the author’s final published version in Oral Oncology, Volume 174, 2026, Article number 107877.
The published version is available at https://doi.org/10.1016/j.oraloncology.2026.107877. Copyright © 2026 The Authors.