MedBench, an authoritative Chinese evaluation platform for medical large models, recently released its latest leaderboard. The DeepBlue-MR-v1 medical large model from DeepBlue Technology not only ranked first in the complex medical reasoning evaluation but also took the top spot in the comprehensive evaluation across multiple dimensions, with a score of 94.2.
According to the leaderboard, the large models participating in this evaluation include the Ali Ant Large Model, the Tencent YouTu TianYan Medical Large Model, the Runyi Medical Large Model built on the Huawei Pangu Large Model, and the Yunzhisheng UniGPT-Med-U1 Large Model, among others.
MedBench is the first authoritative evaluation platform for Chinese medical models, established by Shanghai Artificial Intelligence Laboratory and Shanghai Digital Medical Innovation Center in collaboration with multiple domestic medical institutions and research organizations. The platform draws on the expertise and knowledge of top medical institutions and has evaluated more than 387 models worldwide. Leading companies such as Huawei and Baidu treat it as a threshold for technical validation, and some hospitals even use its evaluation results as a reference in procurement. Its evaluation system has also been included in the application support category for AI Class III certification at the National Medical Products Administration and has been published in journals ranked in Zone 1 of the Chinese Academy of Sciences journal classification, forming a closed loop of industry, academia, research, and evaluation. Internationally, MedBench's depth in its vertical domain is comparable to well-known evaluation systems such as MIMIC-CXR, but it is better suited to the needs of Chinese medical scenarios and has become an important reference for medical AI worldwide.
DeepBlue-MR-v1 is a medical reasoning model independently developed by DeepBlue Technology. It specializes in tasks such as clinical consultation, auxiliary diagnosis, and the development of diagnosis and treatment plans. The team cleaned, constructed, and annotated massive amounts of data, including medical textbooks, diagnosis and treatment guidelines, expert papers, medical records, medical reasoning, medical terminology, and psychological counseling material, and used a self-developed training system to build a dense, Transformer-based large language model aligned with human medical reasoning. The model is first pre-trained on massive, high-quality medical data to construct a semantic space for medical reasoning. Its medical reasoning ability is then improved iteratively in post-training through supervised fine-tuning, medical reasoning instruction enhancement, and multi-stage adaptive reinforcement learning.
Medical reasoning ability is the crown jewel of AI healthcare. The DeepBlue-MR-v1 medical large model from DeepBlue Technology has consistently topped the MedBench leaderboard for complex medical reasoning and has widened its lead. While maintaining that lead, it also ranked first on MedBench in the comprehensive score across five dimensions: medical language understanding, medical language generation, medical knowledge Q&A, complex medical reasoning, and medical safety and ethics. The company presents these results as evidence of its industry-leading technical strength.
The DeepBlue AI Diagnosis Assistant has been deployed in several top-tier hospitals in Hubei. Building on the DeepBlue-MR-v1 medical model, DeepBlue Technology has developed a matrix of AI medical agent products covering "AI Diagnosis Assistant", "Remote Video Diagnosis", "Auxiliary Diagnosis System", and "Medical Expert Knowledge Base". In cooperation with multiple medical institutions, including Wuhan Central Hospital, Wuhan Union Medical College Hospital, Wuhan Blood Center, Wuhan Jingwei Center, and Wuhan Wudong Hospital, the company will jointly promote the deep application of AI technology in scenarios such as consultation, diagnosis, and specialized services.