Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/115813
| DC Field | Value | Language |
|---|---|---|
| dc.contributor | School of Optometry | - |
| dc.contributor | Research Centre for SHARP Vision | - |
| dc.creator | Xu, P | - |
| dc.creator | Wu, Y | - |
| dc.creator | Jin, K | - |
| dc.creator | Chen, X | - |
| dc.creator | He, M | - |
| dc.creator | Shi, D | - |
| dc.date.accessioned | 2025-11-04T03:15:50Z | - |
| dc.date.available | 2025-11-04T03:15:50Z | - |
| dc.identifier.uri | http://hdl.handle.net/10397/115813 | - |
| dc.language.iso | en | en_US |
| dc.publisher | Elsevier Inc. | en_US |
| dc.rights | © 2025 The Author(s). Published by Elsevier Inc. on behalf of Zhejiang University Press. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). | en_US |
| dc.rights | The following publication Xu, P., Wu, Y., Jin, K., Chen, X., He, M., & Shi, D. (2025). DeepSeek-R1 outperforms Gemini 2.0 Pro, OpenAI o1, and o3-mini in bilingual complex ophthalmology reasoning. Advances in Ophthalmology Practice and Research, 5(3), 189–195 is available at https://doi.org/10.1016/j.aopr.2025.05.001. | en_US |
| dc.subject | Clinical decision support | en_US |
| dc.subject | DeepSeek | en_US |
| dc.subject | Gemini | en_US |
| dc.subject | Large language models | en_US |
| dc.subject | OpenAI | en_US |
| dc.subject | Ophthalmology professional examination | en_US |
| dc.subject | Reasoning ability | en_US |
| dc.title | DeepSeek-R1 outperforms Gemini 2.0 Pro, OpenAI o1, and o3-mini in bilingual complex ophthalmology reasoning | en_US |
| dc.type | Journal/Magazine Article | en_US |
| dc.identifier.spage | 189 | - |
| dc.identifier.epage | 195 | - |
| dc.identifier.volume | 5 | - |
| dc.identifier.issue | 3 | - |
| dc.identifier.doi | 10.1016/j.aopr.2025.05.001 | - |
| dcterms.abstract | Purpose: To evaluate the accuracy and reasoning ability of DeepSeek-R1 and three recently released large language models (LLMs) in bilingual complex ophthalmology cases. | - |
| dcterms.abstract | Methods: A total of 130 multiple-choice questions (MCQs) related to diagnosis (n = 39) and management (n = 91) were collected from the Chinese ophthalmology senior professional title examination and categorized into six topics. These MCQs were translated into English. Responses from DeepSeek-R1, Gemini 2.0 Pro, OpenAI o1, and o3-mini were generated under default configurations between February 15 and February 20, 2025. Accuracy was calculated as the proportion of correctly answered questions, with omissions and extra answers considered incorrect. Reasoning ability was evaluated by analyzing reasoning logic and the causes of reasoning errors. | - |
| dcterms.abstract | Results: DeepSeek-R1 demonstrated the highest overall accuracy, achieving 0.862 in Chinese MCQs and 0.808 in English MCQs. Gemini 2.0 Pro, OpenAI o1, and OpenAI o3-mini attained accuracies of 0.715, 0.685, and 0.692 in Chinese MCQs (all P < 0.001 compared with DeepSeek-R1), and 0.746 (P = 0.115), 0.723 (P = 0.027), and 0.577 (P < 0.001) in English MCQs, respectively. DeepSeek-R1 achieved the highest accuracy across five topics in both Chinese and English MCQs. It also excelled in management questions conducted in Chinese (all P < 0.05). Reasoning ability analysis showed that the four LLMs shared similar reasoning logic. Ignoring key positive history, ignoring key positive signs, misinterpretation of medical data, and overuse of non–first-line interventions were the most common causes of reasoning errors. | - |
| dcterms.abstract | Conclusions: DeepSeek-R1 demonstrated superior performance in bilingual complex ophthalmology reasoning tasks compared with three state-of-the-art LLMs. These findings highlight the potential of advanced LLMs to assist in clinical decision-making and suggest a framework for evaluating reasoning capabilities. | - |
| dcterms.accessRights | open access | en_US |
| dcterms.bibliographicCitation | Advances in ophthalmology practice and research, Aug.-Sept. 2025, v. 5, no. 3, p. 189-195 | - |
| dcterms.isPartOf | Advances in ophthalmology practice and research | - |
| dcterms.issued | 2025-08 | - |
| dc.identifier.scopus | 2-s2.0-105009348145 | - |
| dc.identifier.eissn | 2667-3762 | - |
| dc.description.validate | 202511 bcch | - |
| dc.description.oa | Version of Record | en_US |
| dc.identifier.FolderNumber | OA_Scopus/WOS | en_US |
| dc.description.fundingSource | Others | en_US |
| dc.description.fundingText | This study was supported by the Global STEM Professorship Scheme (P0046113) and the Start-up Fund for RAPs under the Strategic Hiring Scheme (P0048623) from HKSAR. | en_US |
| dc.description.pubStatus | Published | en_US |
| dc.description.oaCategory | CC | en_US |
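
The Methods entry in the abstract above defines accuracy as the proportion of correctly answered MCQs, with omitted questions and answers containing extra options both scored as incorrect. The snippet below is a minimal, hypothetical sketch of that scoring rule only, not the authors' evaluation code; the function name `score_mcqs` and the representation of each answer as a set of option letters are assumptions made for illustration.

```python
# Illustrative sketch (not the authors' code) of the scoring rule described in
# the Methods: accuracy = exactly correct MCQs / total MCQs, with omissions and
# answers containing extra options both counted as incorrect.

def score_mcqs(responses, answer_key):
    """responses: dict question_id -> set of chosen option letters (empty set = omission).
    answer_key: dict question_id -> set of correct option letters."""
    correct = 0
    for qid, key in answer_key.items():
        chosen = responses.get(qid, set())
        # Only an exact match with the key counts as correct; an omission or any
        # extra/missing option makes the response incorrect.
        if chosen and chosen == key:
            correct += 1
    return correct / len(answer_key)

# Example: 2 of 3 questions exactly match the key -> accuracy ~= 0.667
print(score_mcqs({1: {"A"}, 2: {"B", "C"}, 3: set()},
                 {1: {"A"}, 2: {"B", "C"}, 3: {"D"}}))
```

Requiring an exact match with the answer key is what makes an extra selected option count against the model rather than as partial credit, consistent with the scoring convention stated in the abstract.
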
Appears in Collections: Journal/Magazine Article
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| 1-s2.0-S2667376225000290-main.pdf | | 1.55 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.



