Title: Use of machine learning algorithms to predict the understandability of health education materials : development and evaluation study
Authors: Ji, M
Liu, Y
Zhao, M
Lyu, Z
Zhang, B
Luo, X 
Li, Y 
Zhong, Y 
Issue Date: May-2021
Source: JMIR medical informatics, 6 May 2021, v. 9, no. 5, e28413
Abstract: Background: Improving the understandability of health information can significantly increase the cost-effectiveness and efficiency of health education programs for vulnerable populations. There is a pressing need to develop clinically informed computerized tools to enable rapid, reliable assessment of the linguistic understandability of specialized health and medical education resources. This paper fills a critical gap in current patient-oriented health resource development, which requires reliable and accurate evaluation instruments to increase the efficiency and cost-effectiveness of health education resource evaluation.
Objective: We aimed to translate internationally endorsed clinical guidelines to machine learning algorithms to facilitate the evaluation of the understandability of health resources for international students at Australian universities.
Methods: Based on international patient health resource assessment guidelines, we developed machine learning algorithms to predict the linguistic understandability of health texts for Australian college students (aged 25-30 years) from non-English speaking backgrounds. We compared extreme gradient boosting, random forest, neural networks, and C5.0 decision tree for automated health information understandability evaluation. The 5 machine learning models achieved statistically better results compared to the baseline logistic regression model. We also evaluated the impact of each linguistic feature on the performance of each of the 5 models.
Results: We found that information evidentness, relevance to educational purposes, and logical sequence were consistently more important than numeracy skills and medical knowledge when assessing the linguistic understandability of health education resources for international tertiary students with adequate English skills (International English Language Testing System mean score 6.5) and high health literacy (mean 16.5 in the Short Assessment of Health Literacy-English test). Our results challenge the traditional views that lack of medical knowledge and numerical skills constituted the barriers to the understanding of health educational materials.
Conclusions: Machine learning algorithms were developed to predict health information understandability for international college students aged 25-30 years. Thirteen natural language features and 5 evaluation dimensions were identified and compared in terms of their impact on the performance of the models. Health information understandability varies according to the demographic profiles of the target readers, and for international tertiary students, improving health information evidentness, relevance, and logic is critical.
Keywords: Health education
Machine learning
Understandability evaluation
Publisher: JMIR Publications
Journal: JMIR medical informatics 
EISSN: 2291-9694
DOI: 10.2196/28413
Rights: ©Meng Ji, Yanmeng Liu, Mengdan Zhao, Ziqing Lyu, Boren Zhang, Xin Luo, Yanlin Li, Yin Zhong. Originally published in JMIR Medical Informatics, 06.05.2021. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.
The following publication: Ji M, Liu Y, Zhao M, Lyu Z, Zhang B, Luo X, Li Y, Zhong Y. Use of Machine Learning Algorithms to Predict the Understandability of Health Education Materials: Development and Evaluation Study. JMIR Med Inform 2021;9(5):e28413 is available at https://doi.org/10.2196/28413
Appears in Collections: Journal/Magazine Article

Files in This Item:
File: Ji_Use_machine_learning.pdf
Size: 667.1 kB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.