Challenges and solutions for the development of medical GPTs - ZENTIME PUBLISHING CORPORATION LIMITED

Metaverse in Medicine | Expert Commentary

Challenges and solutions for the development of medical GPTs

Chunxue Bai^{1, 2, 3, 4*}

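1. Department of Pulmonary and Critical Care Medicine, Zhongshan Hospital, Fudan University, Shanghai 200032

2. Shanghai Engineering Research Center of Internet of Things for Respiratory Medicine, Shanghai 200032

3. Shanghai Respiratory Research Institute, Shanghai 200032

4. AI+ Lung Cancer Prevention and Treatment Center, Zhongshan Hospital, Fudan University, Shanghai 200032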

[About the author] Chunxue Bai, PhD, chief physician, professor. E-mail: bai.chunxue@zs-hospital.sh.cn

* Corresponding author

[Funding] Major Special Project on the Four Major Chronic Diseases (2024ZD0529300).

[Received] 2026-01-05 [Accepted] 2026-03-01 [Published] 2026-03-30

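Ethics statement: None.

Conflict of interest: All authors declare no conflicts of interest.

Author contributions: Chunxue Bai: topic selection, writing, and final approval of the manuscript.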

DOI: https://doi.org/10.61189/799037wwkyrc

Abstract

This commentary systematically reviews the key challenges in developing medical GPT systems and, drawing on recent international reviews, evaluation frameworks, ethical governance and regulatory documents, and Prof. Chunxue Bai's "Pulmonary Nodule Expert: BAIMGPT White Paper", summarizes the main routes by which medical GPTs can move from general-purpose large models to clinically usable systems. The analysis synthesizes systematic reviews, methodological studies, and real-world workflow evaluations published in recent years in high-impact medical, digital medicine, and artificial intelligence journals, together with WHO ethics and governance guidance and FDA recommendations on the lifecycle of AI-enabled devices. It examines factual reliability, knowledge updating, data governance, multimodal integration, workflow adaptation, explainability, bias and fairness, and the boundaries of accountability. Current evidence indicates that the main bottleneck of medical GPTs is not simply "insufficient accuracy" in the general sense. The more pressing problems are a high risk of hallucination and factual error, limited uptake of newly published medical knowledge, highly heterogeneous medical data with unstable labels, the inability of text-only models to support multimodal decision-making in real clinical practice, a mismatch between high offline scores and performance in real deployment, incomplete chains of explanation and accountability, and prominent concerns about bias, fairness, and ethical governance. High-quality studies further show that current LLMs are sensitive to the order and amount of information they receive, follow instructions unreliably, and are not yet ready for autonomous clinical decision-making. The core task of medical GPT development is therefore not simply to improve language generation, but to rebuild large models into medical intelligence systems with reliable knowledge, workflow compatibility, traceability, and institutional acceptability. At present, medical GPTs are better positioned as tools for knowledge augmentation and workflow support than as automated decision systems that replace clinicians. Future work should keep to the basic boundary of "assisting rather than replacing decision-making" and advance clinical application steadily on the basis of stronger data governance, evidence-based updating, real-world validation, and full-lifecycle governance.

Keywords: medical GPT; large language model; clinical decision support; retrieval-augmented generation; data governance; human-AI collaboration; disease-specific agent; BAIMGPT
