Developing a Data Quality Management Algorithm for Hypertension Patients with Comorbid Diabetes Mellitus Using Data Mining

  • Received : 2016.06.01
  • Accepted : 2016.07.20
  • Published : 2016.07.28

Abstract

Improving the quality of health care data requires data quality management algorithms. In this study, we developed a data quality management algorithm for diabetes as a comorbidity in patients with hypertension, a disease with high prevalence and hospitalization rates. We extracted 61,199 hypertension cases from the 2011 and 2012 Korean National Hospital Discharge In-depth Injury Survey data as the analysis set. The algorithm was developed using an interactive decision tree and an outlier detection methodology from data mining. The factors significantly associated with diabetes as a comorbidity in hypertension patients were gender, age, glomerular disorders in diabetes mellitus, diabetic retinopathy, diabetic polyneuropathy, and closed [percutaneous] [needle] biopsy of kidney. Based on the decision tree results, we defined as outliers (extreme values) those groups in which the probability of having diabetes as a comorbidity was 80% or higher, or 20% or lower, and found six such groups among hypertension patients. The actual records in these outlier groups should be checked in order to improve the quality of the data.
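The outlier rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the group labels stand in for decision-tree leaves and are hypothetical, the records are invented toy data, and the function name `flag_extreme_groups` is an assumption made for this example. Only the 80% and 20% thresholds come from the abstract.

```python
from collections import defaultdict


def flag_extreme_groups(records, high=0.80, low=0.20):
    """Return {group: (n_cases, diabetes_rate)} for extreme groups.

    records: iterable of (group_label, has_diabetes) pairs, where
    group_label stands in for a decision-tree leaf (hypothetical labels).
    A group is flagged when its diabetes rate is >= high or <= low.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [n_cases, n_with_diabetes]
    for group, has_diabetes in records:
        counts[group][0] += 1
        counts[group][1] += int(has_diabetes)

    flagged = {}
    for group, (n_cases, positives) in counts.items():
        rate = positives / n_cases
        if rate >= high or rate <= low:
            flagged[group] = (n_cases, rate)
    return flagged


# Toy data, invented for illustration only (not from the survey).
toy_records = (
    [("retinopathy_coded", True)] * 9 + [("retinopathy_coded", False)] * 1
    + [("age_under_40", True)] * 1 + [("age_under_40", False)] * 9
    + [("age_60_plus", True)] * 4 + [("age_60_plus", False)] * 6
)

extreme = flag_extreme_groups(toy_records)
for group, (n_cases, rate) in sorted(extreme.items()):
    print(f"{group}: n={n_cases}, diabetes rate={rate:.0%}")
```

In this toy example, the two groups with rates of 90% and 10% are flagged, while the group at 40% is not. The records in the flagged groups would then be the candidates for manual review.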

Acknowledgement

Grant : Convergence Health and Medical Technology

Supported by : Korea Health Industry Development Institute
