Document Type

Article

Publication Date

11-18-2025

Comments

This article is the author’s final published version in International Medical Education, Volume 4, Issue 4, 2025, Article number 49.

The published version is available at https://doi.org/10.3390/ime4040049. Copyright © 2025 by the authors.

Abstract

Residents preparing for pathology board exams frequently use multiple-choice questions (MCQs) from question banks (QBs) such as PathDojo and PathPrimer, which can be costly. ChatGPT, a free tool, has been used to generate MCQs for other tests such as the SAT. This study compared the quality of pathology MCQs created by ChatGPT with that of commercially available study questions for the American Board of Pathology (ABPath) certifying exams. A rubric adapted from the National Board of Medical Examiners’ (NBME) question-writing guide was validated by two pathologists using commercially available pathology board exam questions. This rubric was then used to evaluate MCQs from commercially available pathology board study books as well as MCQs created by ChatGPT. The percentage of criteria met was compared between ChatGPT and control MCQs using chi-square analysis, with significance set at p < 0.05. ChatGPT MCQs were less likely than commercially available MCQs to meet four criteria: identifying the best answer choice (82.5% vs. 100%), reflecting current practice (84.6% vs. 100%), providing an error-free explanation (87.9% vs. 100%), and providing an explanation that reflects current practice (87.9% vs. 100%). However, the complexity of the ChatGPT-generated questions was higher (78.5% vs. 47.2%). At this time, ChatGPT-generated MCQs should not be used in the same way as commercially available study guides; however, there is potential for large language models (LLMs) to create quality study materials and exam questions with careful monitoring.

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Language

English
