SKMC Student Presentations and Publications

Biomedical Text Readability After Hypernym Substitution with Fine-Tuned Large Language Models

Karl Swanson
Shuhan He
Josh Calvano
David Chen
Talar Telvizian
Lawrence Jiang
Paul Chong
Jacob Schwell, Thomas Jefferson University
Gin Mak
Jarone Lee

Document Type

Article

Publication Date

4-16-2024

Comments

This article is the author's final published version in PLoS Digital Health, Volume 3, Issue 4, April 2024, Article number e0000489.

The published version is available at https://doi.org/10.1371/journal.pdig.0000489 .

Abstract

The advent of patient access to complex medical information online has highlighted the need for simplification of biomedical text to improve patient understanding and engagement in taking ownership of their health. However, comprehension of biomedical text remains a difficult task due to the need for domain-specific expertise. We aimed to study the simplification of biomedical text via large language models (LLMs), which are commonly used for general natural language processing tasks involving text comprehension, summarization, generation, and prediction of new text from prompts. Specifically, we fine-tuned three variants of large language models to substitute complex words and word phrases in biomedical text with a related hypernym. The output of the text substitution process was evaluated by comparing the pre- and post-substitution texts using four readability metrics and two measures of sentence complexity. A sample of 1,000 biomedical definitions in the National Library of Medicine's Unified Medical Language System (UMLS) was processed with three LLM approaches, and each showed an improvement in readability and sentence complexity after hypernym substitution. Readability scores improved from a collegiate reading level before processing to a US high-school level after processing. Comparison between the three LLMs showed that the GPT-J-6b approach had the best improvement in measures of sentence complexity. This study demonstrates the merit of hypernym substitution to improve readability of complex biomedical text for the public and highlights the use case for fine-tuning open-access large language models for biomedical natural language processing.
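
The pre- and post-substitution comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not name the four readability metrics used, so the Flesch-Kincaid grade level here is an assumption chosen only as a familiar example. The two sentences are hypothetical and are not taken from UMLS or from the study, and the syllable counter is a crude vowel-group heuristic rather than a validated one.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count runs of consecutive vowels (heuristic)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level: higher values mean harder text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (
        0.39 * len(words) / len(sentences)
        + 11.8 * syllables / len(words)
        - 15.59
    )


# Hypothetical example pair (not from the study): a definition before and
# after complex terms are replaced with broader, more familiar ones.
before = "Myocardial infarction causes necrosis of cardiac tissue."
after = "A heart attack causes death of heart tissue."

grade_before = flesch_kincaid_grade(before)
grade_after = flesch_kincaid_grade(after)
print(f"before: {grade_before:.1f}, after: {grade_after:.1f}")
```

In this sketch a lower grade after substitution indicates easier text. A real evaluation would use a validated readability library and a corpus of definitions rather than a single sentence pair.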

Recommended Citation

Swanson, Karl; He, Shuhan; Calvano, Josh; Chen, David; Telvizian, Talar; Jiang, Lawrence; Chong, Paul; Schwell, Jacob; Mak, Gin; and Lee, Jarone, "Biomedical Text Readability After Hypernym Substitution with Fine-Tuned Large Language Models" (2024). SKMC Student Presentations and Publications. Paper 13.
https://jdc.jefferson.edu/skmcstudentworks/13

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.

Language

English

Included in

Medicine and Health Sciences Commons, Psycholinguistics and Neurolinguistics Commons, Psychology Commons
