Current Status and Direction of Generative Large Language Model Applications in Medicine - Focusing on East Asian Medicine -

  • Bongsu Kang (Department of Physiology, College of Korean Medicine, Gachon University) ;
  • SangYeon Lee (Division of Environmental Science & Ecological Engineering, College of Life Science and Biotechnology, Korea University) ;
  • Hyojin Bae (Department of Physiology, Seoul National University College of Medicine) ;
  • Chang-Eop Kim (Department of Physiology, College of Korean Medicine, Gachon University)
  • Received : 2024.02.23
  • Accepted : 2024.04.23
  • Published : 2024.04.25

Abstract

The rapid advancement of generative large language models has transformed various real-world domains, underscoring the importance of exploring their applications in healthcare. This study examines how generative large language models are being implemented in the medical domain, with the specific aim of assessing the possibilities and potential for integrating generative large language models with East Asian medicine. Through a comprehensive analysis of the current state of the field, we identified limitations in the deployment of generative large language models within East Asian medicine and proposed directions for future research. Our findings highlight the essential need to accumulate and generate structured data in order to improve the capabilities of generative large language models in East Asian medicine. We also address the problem of hallucination and the necessity of a robust framework for model evaluation. Despite these challenges, the application of generative large language models in East Asian medicine has already demonstrated promising results. Techniques such as model augmentation, multimodal architectures, and knowledge distillation have the potential to significantly enhance accuracy, efficiency, and accessibility. In conclusion, we expect generative large language models to play a pivotal role in enabling precise diagnosis and personalized treatment in clinical practice, and in fostering innovation in education and research within East Asian medicine.
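The "model augmentation" approaches discussed above can be illustrated with a minimal retrieval-augmented generation (RAG) sketch: a user query over a small East Asian medicine corpus retrieves grounding passages that are prepended to the model prompt, constraining generation to domain sources and thereby reducing hallucination. All corpus text, function names, and the keyword-overlap scoring below are illustrative assumptions, not the implementation of any system cited in this article; production pipelines typically use dense vector embeddings for retrieval and a hosted large language model for generation.

```python
# Toy RAG pipeline: retrieve -> assemble grounded prompt.
# Keyword overlap stands in for a real embedding-based retriever.

def tokenize(text: str) -> set:
    """Lowercase word-set tokenizer (toy stand-in for an embedder)."""
    return set(text.lower().replace(",", " ").replace(".", " ").split())

def retrieve(query: str, corpus: list, k: int = 1) -> list:
    """Rank corpus passages by token overlap with the query, return top k."""
    scored = sorted(
        corpus,
        key=lambda p: len(tokenize(p) & tokenize(query)),
        reverse=True,
    )
    return scored[:k]

def build_prompt(query: str, passages: list) -> str:
    """Ground the model's answer in retrieved domain text to curb hallucination."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )

# Hypothetical two-passage East Asian medicine corpus.
corpus = [
    "Sasang constitutional medicine classifies patients into four constitutional types.",
    "Acupuncture point LI4 is located on the dorsum of the hand.",
]
query = "Where is acupuncture point LI4 located?"
prompt = build_prompt(query, retrieve(query, corpus))
```

In a deployed system, `prompt` would be sent to a language model; the retrieval step is what allows a general-purpose model to answer from curated, structured domain data, which is why the abstract emphasizes accumulating such data.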

Keywords

Funding

This work was supported by the Gachon University research fund of 2022 (GCU-202206720001).

References

  1. Min B, Ross H, Sulem E, Veyseh APB, Nguyen TH, Sainz O, et al. Recent Advances in Natural Language Processing via Large Pre-trained Language Models: A Survey. ACM Comput Surv. 2023;56(2):Article 30.
  2. Minaee S, Mikolov T, Nikzad N, Chenaghlu M, Socher R, Amatriain X, et al. Large Language Models: A Survey. arXiv preprint arXiv:2402.06196. 2024.
  3. Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, Gomez AN, et al. Attention is all you need. Advances in neural information processing systems. 2017;30.
  4. Devlin J, Chang M-W, Lee K, Toutanova K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805. 2018.
  5. Fan L, Li L, Ma Z, Lee S, Yu H, Hemphill L. A bibliometric review of large language models research from 2017 to 2023. arXiv preprint arXiv:2304.02020. 2023.
  6. Gomez Cano CA, Sanchez Castillo V, Clavijo Gallego TA. Unveiling the Thematic Landscape of Generative Pre-trained Transformer (GPT) Through Bibliometric Analysis. Metaverse Basic and Applied Research. 2023;2:33.
  7. Li J, Dada A, Puladi B, Kleesiek J, Egger J. ChatGPT in healthcare: A taxonomy and systematic review. Computer Methods and Programs in Biomedicine. 2024;245:108013.
  8. Kasneci E, Sessler K, Kuchemann S, Bannert M, Dementieva D, Fischer F, et al. ChatGPT for good? On opportunities and challenges of large language models for education. Learning and Individual Differences. 2023;103:102274.
  9. Brynjolfsson E, Li D, Raymond LR. Generative AI at work. National Bureau of Economic Research; 2023.
  10. Huang L, Yu W, Ma W, Zhong W, Feng Z, Wang H, et al. A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions. arXiv preprint arXiv:2311.05232. 2023.
  11. Guo Y, Shi H, Kumar A, Grauman K, Rosing T, Feris R, editors. Spottune: transfer learning through adaptive fine-tuning. Proceedings of the IEEE/CVF conference on computer vision and pattern recognition; 2019.
  12. Lewis P, Perez E, Piktus A, Petroni F, Karpukhin V, Goyal N, et al. Retrieval-augmented generation for knowledge-intensive nlp tasks. Advances in Neural Information Processing Systems. 2020;33:9459-74.
  13. Lee J, Yoon W, Kim S, Kim D, Kim S, So CH, et al. BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics. 2019;36(4):1234-40.
  14. Roy A, Pan S, editors. Incorporating medical knowledge in BERT for clinical relation extraction. Proceedings of the 2021 conference on empirical methods in natural language processing; 2021.
  15. Chen Y-P, Lo Y-H, Lai F, Huang C-H. Disease Concept-Embedding Based on the Self-Supervised Method for Medical Information Extraction from Electronic Health Records and Disease Retrieval: Algorithm Development and Validation Study. J Med Internet Res. 2021;23(1):e25113.
  16. Kottlors J, Bratke G, Rauen P, Kabbasch C, Persigehl T, Schlamann M, et al. Feasibility of Differential Diagnosis Based on Imaging Patterns Using a Large Language Model. Radiology. 2023;308(1):e231167.
  17. Kuckelman IJ, Yi PH, Bui M, Onuh I, Anderson JA, Ross AB. Assessing AI-Powered Patient Education: A Case Study in Radiology. Academic Radiology. 2024;31(1):338-42.
  18. Mbakwe AB, Lourentzou I, Celi LA, Mechanic OJ, Dagan A. ChatGPT passing USMLE shines a spotlight on the flaws of medical education. PLOS Digital Health. 2023;2(2):e0000205.
  19. Wang H, Wu W, Dou Z, He L, Yang L. Performance and exploration of ChatGPT in medical examination, records and education in Chinese: Pave the way for medical AI. International Journal of Medical Informatics. 2023;177:105173.
  20. Kasai J, Kasai Y, Sakaguchi K, Yamada Y, Radev D. Evaluating GPT-4 and ChatGPT on Japanese medical licensing examinations. arXiv preprint arXiv:2303.18027. 2023.
  21. Ferdush J, Begum M, Hossain ST. ChatGPT and Clinical Decision Support: Scope, Application, and Limitations. Annals of Biomedical Engineering. 2023.
  22. Holderried F, Stegemann-Philipps C, Herschbach L, Moldt J-A, Nevins A, Griewatz J, et al. A Generative Pretrained Transformer (GPT)-Powered Chatbot as a Simulated Patient to Practice History Taking: Prospective, Mixed Methods Study. JMIR Medical Education. 2024;10(1):e53961.
  23. Zhang M, Yang Z, Liu C, Fang L, editors. Traditional Chinese medicine knowledge service based on semi-supervised BERT-BiLSTM-CRF model. 2020 International Conference on Service Science (ICSS); 2020: IEEE.
  24. Mucheng R, Heyan H, Yuxiang Z, Qianwen C, Yuan B, Yang G, editors. TCM-SD: A Benchmark for Probing Syndrome Differentiation via Natural Language Processing. 2022 October; Nanchang, China: Chinese Information Processing Society of China.
  25. Oh J. A Strategy for Disassembling the Traditional East Asian Medicine Herbal Formulas With Machine Learning. Journal of Oriental Medical Classics. 2023;36(2):23-34.
  26. Oh J. Comparison of Word Extraction Methods Based on Unsupervised Learning for Analyzing East Asian Traditional Medicine Texts. Journal of Oriental Medical Classics. 2019;32(3):47-57.
  27. Agrawal M, Hegselmann S, Lang H, Kim Y, Sontag D. Large language models are zero-shot clinical information extractors. arXiv preprint arXiv:2205.12689. 2022.
  28. Guo E, Gupta M, Sinha S, Rossler K, Tatagiba M, Akagami R, et al. neuroGPT-X: toward a clinic-ready large language model. Journal of Neurosurgery. 2023:1-13.
  29. Sivarajkumar S, Kelley M, Samolyk-Mazzanti A, Visweswaran S, Wang Y. An empirical evaluation of prompting strategies for large language models in zero-shot clinical natural language processing. arXiv preprint arXiv:2309.08008. 2023.
  30. Kartchner D, Ramalingam S, Al-Hussaini I, Kronick O, Mitchell C, editors. Zero-Shot Information Extraction for Clinical Meta-Analysis using Large Language Models. 2023 July; Toronto, Canada: Association for Computational Linguistics.
  31. Zhang J, Sun K, Jagadeesh A, Ghahfarokhi M, Gupta D, Gupta A, et al. The Potential and Pitfalls of using a Large Language Model such as ChatGPT or GPT-4 as a Clinical Assistant. arXiv preprint arXiv:2307.08152. 2023.
  32. Holmes J, Liu Z, Zhang L, Ding Y, Sio TT, McGee LA, et al. Evaluating large language models on a highly-specialized topic, radiation oncology physics. arXiv preprint arXiv:2304.01938. 2023.
  33. Krusche M, Callhoff J, Knitza J, Ruffer N. Diagnostic accuracy of a large language model in rheumatology: comparison of physician and ChatGPT-4. Rheumatology International. 2024;44(2):303-6.
  34. Liu Z, et al. DeID-GPT: Zero-shot medical text de-identification by GPT-4. arXiv preprint arXiv:2303.11032. 2023.
  35. Zhang K, Yu J, Yan Z, Liu Y, Adhikarla E, Fu S, et al. BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks. arXiv preprint arXiv:2305.17100. 2023.
  36. Wang G, Yang G, Du Z, Fan L, Li X. ClinicalGPT: Large Language Models Finetuned with Diverse Medical Data and Comprehensive Evaluation. arXiv preprint arXiv:2306.09968. 2023.
  37. Li Y, Li Z, Zhang K, Dan R, Zhang Y. ChatDoctor: A medical chat model fine-tuned on LLaMA model using medical domain knowledge. arXiv preprint arXiv:2303.14070. 2023.
  38. Chen Z, Cano AH, Romanou A, Bonnet A, Matoba K, Salvi F, et al. Meditron-70b: Scaling medical pretraining for large language models. arXiv preprint arXiv:2311.16079. 2023.
  39. Wu C, Zhang X, Zhang Y, Wang Y, Xie W. PMC-LLaMA: Further finetuning LLaMA on medical papers. arXiv preprint arXiv:2304.14454. 2023.
  40. Zakka C, Shad R, Chaurasia A, Dalal AR, Kim JL, Moor M, et al. Almanac - Retrieval-Augmented Language Models for Clinical Medicine. NEJM AI. 2024;1(2):AIoa2300068.
  41. Singhal K, Azizi S, Tu T, Mahdavi SS, Wei J, Chung HW, et al. Large language models encode clinical knowledge. Nature. 2023;620(7972):172-80.
  42. Han T, Adams LC, Papaioannou J-M, Grundmann P, Oberhauser T, Loser A, et al. MedAlpaca - An Open-Source Collection of Medical Conversational AI Models and Training Data. arXiv preprint arXiv:2304.08247. 2023.
  43. Shi Y, et al. MedEdit: Model editing for medical question answering with external knowledge bases. arXiv preprint arXiv:2309.16035. 2023.
  44. Niu S, Ma J, Bai L, Wang Z, Guo L, Yang X. EHR-KnowGen: Knowledge-enhanced multimodal learning for disease diagnosis generation. Information Fusion. 2024;102:102069.
  45. Yang J, Liu C, Deng W, Wu D, Weng C, Zhou Y, Wang K. Enhancing phenotype recognition in clinical notes using large language models: PhenoBCBERT and PhenoGPT. Patterns. 2024;5(1):100887.
  46. Ghosh A, Acharya A, Jain R, Saha S, Chadha A, Sinha S. Clipsyntel: Clip and llm synergy for multimodal question summarization in healthcare. arXiv preprint arXiv:2312.11541. 2023.
  47. Wang H, Gao C, Dantona C, Hull B, Sun J. DRG-LLaMA : tuning LLaMA model to predict diagnosis-related group for hospitalized patients. npj Digital Medicine. 2024;7(1):16.
  48. Jin M, Yu Q, Zhang C, Shu D, Zhu S, Du M, et al. Health-LLM: Personalized Retrieval-Augmented Disease Prediction Model. arXiv preprint arXiv:2402.00746. 2024.
  49. Zhou J, He X, Sun L, Xu J, Chen X, Chu Y, et al. Pre-trained Multimodal Large Language Model Enhances Dermatological Diagnosis using SkinGPT-4. medRxiv. 2023:2023.06.10.23291127.
  50. Tu T, Palepu A, Schaekermann M, Saab K, Freyberg J, Tanno R, et al. Towards conversational diagnostic AI. arXiv preprint arXiv:2401.05654. 2024.
  51. Liu Z, Wang P, Li Y, Holmes J, Shu P, Zhang L, et al. RadOnc-GPT: A large language model for radiation oncology. arXiv preprint arXiv:2309.10160. 2023.
  52. Yizhen L, Shaohan H, Jiaxing Q, Lei Q, Dongran H, Zhongzhi L. Exploring the Comprehension of ChatGPT in Traditional Chinese Medicine Knowledge. arXiv preprint arXiv:2403.09164. 2024.
  53. Hsu H-Y, Hsu K-C, Hou S-Y, Wu C-L, Hsieh Y-W, Cheng Y-D. Examining Real-World Medication Consultations and Drug-Herb Interactions: ChatGPT Performance Evaluation. JMIR Med Educ. 2023;9:e48433.
  54. Lee H. Using ChatGPT as a Learning Tool in Acupuncture Education: Comparative Study. JMIR Med Educ. 2023;9:e47427.
  55. Chen Q, Ni J, Xu J, Gao X, Xia L. Generation of traditional Chinese medicine prescription driven by generative artificial intelligence GPT-4. China Pharmacy. 2023;34(23):2825-8.
  56. Jang D, Yun T-R, Lee C-Y, Kwon Y-K, Kim C-E. GPT-4 can pass the Korean National Licensing Examination for Korean Medicine Doctors. PLOS Digital Health. 2023;2(12):e0000416.
  57. Kim T-H, Kang JW, Lee MS. AI Chat bot - ChatGPT-4: A new opportunity and challenges in complementary and alternative medicine. Integrative Medicine Research. 2023;12(3):100977.
  58. Liu J, Zhou P, Hua Y, Chong D, Tian Z, Liu A, et al. Benchmarking Large Language Models on CMExam - A Comprehensive Chinese Medical Exam Dataset. Advances in Neural Information Processing Systems. 2024;36.
  59. Zhu L, Mou W, Lai Y, Lin J, Luo P. Language and cultural bias in AI: comparing the performance of large language models developed in different countries on Traditional Chinese Medicine highlights the need for localized models. Journal of Translational Medicine. 2024;22(1):319.
  60. Zhang Y, Hao Y. Traditional Chinese Medicine Knowledge Graph Construction Based on Large Language Models. Electronics. 2024;13(7):1395.
  61. Park S-Y, Kim C-E. Enhancing Korean Medicine Education with Large Language Models: Focusing on the Development of Educational Artificial Intelligence. Journal of Physiology & Pathology in Korean Medicine. 2023;37(5):134-8.
  62. Li M, Zheng X. Identification of Ancient Chinese Medical Prescriptions and Case Data Analysis Under Artificial Intelligence GPT Algorithm: A Case Study of Song Dynasty Medical Literature. IEEE Access. 2023;11:131453-64.
  63. Zhu J, Gong Q, Zhou C, Luan H. ZhongJing: A Locally Deployed Large Language Model for Traditional Chinese Medicine and Corresponding Evaluation Methodology: A Large Language Model for data fine-tuning in the field of Traditional Chinese Medicine, and a new evaluation method called TCMEval are proposed. Proceedings of the 2023 4th International Symposium on Artificial Intelligence for Medicine Science. 2024;1036-42.
  64. Zhang H, Wang X, Meng Z, Jia Y, Xu D. Qibo: A Large Language Model for Traditional Chinese Medicine. arXiv preprint arXiv:2403.16056. 2024.
  65. Kang B, Kim J, Yun T-R, Kim C-E. Prompt-RAG: Pioneering Vector Embedding-Free Retrieval-Augmented Generation in Niche Domains, Exemplified by Korean Medicine. arXiv preprint arXiv:2401.11246. 2024.
  66. Yang G, Shi J, Wang Z, Liu X, Wang G. TCM-GPT: Efficient Pre-training of Large Language Models for Domain Adaptation in Traditional Chinese Medicine. arXiv preprint arXiv:2311.01786. 2023.
  67. Tan Y, Zhang Z, Li M, Pan F, Duan H, Huang Z, et al. MedChatZH: A tuning LLM for traditional Chinese medicine consultations. Computers in Biology and Medicine. 2024;172:108290.
  68. Yang T, Wang X-Y, Zhu Y, Hu K-F, Zhu X-F. Research Ideas and Methods of Intelligent Diagnosis and Treatment of Traditional Chinese Medicine Driven by Large Language Model. Journal of Nanjing University of Traditional Chinese Medicine. 2023;39(10):967-71.
  69. Pu H, Mi J, Lu S, He J, editors. RoKEPG: RoBERTa and Knowledge Enhancement for Prescription Generation of Traditional Chinese Medicine. 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM); 2023: IEEE.
  70. Wang X, Yang T, Hu K. Research on personalized prescription recommendation of traditional Chinese medicine based on large language pre-training model. Chinese Archives of Traditional Chinese Medicine. 1-14.
  71. Li Y, Peng X, Li J, Zuo X, Peng S, Pei D, et al. Relation Extraction Using Large Language Models: A Case Study on Acupuncture Point Locations. arXiv preprint arXiv:2404.05415. 2024.
  72. Zhou Z, Yang T, Hu K, editors. Traditional Chinese Medicine Epidemic Prevention and Treatment Question-Answering Model Based on LLMs. 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM); 2023 5-8 Dec. 2023.
  73. Liu C, Sun K, Zhou Q, Duan Y, Shu J, Kan H, et al. CPMI-ChatGLM: parameter-efficient fine-tuning ChatGLM with Chinese patent medicine instructions. Scientific Reports. 2024;14(1):6403.
  74. Wang Z, Li K, Ren Q, Yao K, Zhu Y, editors. Traditional Chinese Medicine Formula Classification Using Large Language Models. 2023 IEEE International Conference on Bioinformatics and Biomedicine (BIBM); 2023 5-8 Dec. 2023.
  75. Zhang J, Yang S, Liu J, Huang Q. AIGC Empowering the Revitalization of Traditional Chinese Medicine Ancient Books: A Study on the Construction of the Huang-Di Large Language Model. Library Tribune. 1-13.
  76. Tonmoy S, Zaman S, Jain V, Rani A, Rawte V, Chadha A, et al. A comprehensive survey of hallucination mitigation techniques in large language models. arXiv preprint arXiv:2401.01313. 2024.
  77. Zhang D, Yu Y, Li C, Dong J, Su D, Chu C, et al. MM-LLMs: Recent advances in multimodal large language models. arXiv preprint arXiv:2401.13601. 2024.
  78. Gou J, Yu B, Maybank SJ, Tao D. Knowledge distillation: A survey. International Journal of Computer Vision. 2021;129:1789-819.
  79. Elbadawi M, Li H, Basit AW, Gaisford S. The role of artificial intelligence in generating original scientific research. International Journal of Pharmaceutics. 2024;652:123741.