Description
Undergraduate medical education (UME) faces several challenges, including the need for adequate practice items that replicate standardized medical examinations such as the United States Medical Licensing Examination (USMLE).1 This demand suggests a role for innovative, efficient approaches to item generation. Artificial intelligence (AI) large language models (LLMs), such as those employed by ChatGPT, present an attractive solution.
Previous authors have investigated ChatGPT’s ability to “pass” high-stakes assessments, such as the USMLE,2–4 the ophthalmology and radiology board examinations,5,6 and other nations’ certification examinations.7 Far less has been published on ChatGPT’s ability to construct vignette-based, single-best-answer multiple-choice items like those employed by these assessments,8–10 and the existing studies rely on broad categories of item flaws and offer little comparative psychometric analysis of item performance.
This study investigated the utility and feasibility of ChatGPT as an author of USMLE-style questions, with the following research questions:
- Once fine-tuned, can ChatGPT successfully generate factually accurate questions that adhere to predetermined style and content guidelines?
- How efficient is ChatGPT at writing questions compared with human subject matter experts?
- Do the psychometric characteristics of ChatGPT’s items differ from those of human-written items?
Publication Date
3-30-2025
Keywords
artificial intelligence, AI, ChatGPT, assessment, USMLE
Disciplines
Medical Education | Medicine and Health Sciences | Pathology
Recommended Citation
Macnow, MD, MHPE, Alexander, "“Coherent Nonsense”: Lessons Learned from Utilizing ChatGPT for USMLE-Style Anatomy and Pathology Questions" (2025). Department of Medical Education Posters. 2.
https://jdc.jefferson.edu/medicaledposters/2
Comments
Presented at Anatomy Connected 2025 (American Association for Anatomy, AAA).