Health Risk Management using Feature Extraction and Cluster Analysis considering Time Flow

  • Kang, Ji-Soo (Department of Computer Science, Kyonggi University) ;
  • Chung, Kyungyong (Division of Computer Science and Engineering, Kyonggi University) ;
  • Jung, Hoill (Division of Computer Information, Daelim University)
  • Received : 2020.12.08
  • Accepted : 2021.01.20
  • Published : 2021.01.28

Abstract

In this paper, we propose a health risk management method that uses feature extraction and cluster analysis considering time flow. The proposed method proceeds in three steps. The first is the pre-processing and feature extraction step. It collects the user's lifelog using a wearable device, removes incomplete data, errors, noise, and contradictory data, and handles missing values. Then, for feature extraction, important variables are selected through principal component analysis, and the correlation coefficient and covariance are used to identify relationships among the data and to group similar data. Next, to analyze the features extracted from the lifelog, dynamic clustering is performed with the K-means algorithm in consideration of the passage of time. New data are assigned to clusters using a similarity distance measure based on the increment of the sum of squared errors, and information about each cluster is extracted in consideration of the passage of time. Finally, a health decision-making system built on the feature clusters makes it possible to manage risk through factors such as physical characteristics, lifestyle habits, disease status, risk of healthcare event occurrence, and predictability. The performance evaluation compares the proposed method with fuzzy and kernel-based clustering using precision, recall, and F-measure. In this evaluation, the proposed method is rated superior to both. The proposed method therefore makes it possible to accurately predict and appropriately manage a user's potential health risk by using the user's similarity to existing patients.
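
The abstract does not state the exact formula used to cluster new data, so the following is only a minimal sketch of one common way to assign a point by the increment of the sum of squared errors (SSE). It assumes the standard identity that adding a point x to a cluster of n points with centroid c raises that cluster's SSE by n / (n + 1) * ||x - c||^2. The function names `sse_increment` and `assign_new_point` and the sample values are hypothetical and are not taken from the paper.

```python
from math import fsum


def sse_increment(point, centroid, size):
    """Increase in a cluster's SSE if `point` were added to it.

    Assumes `centroid` is the mean of the `size` points already in the
    cluster, so the increase is size / (size + 1) * ||point - centroid||^2.
    """
    sq_dist = fsum((p - c) ** 2 for p, c in zip(point, centroid))
    return size / (size + 1) * sq_dist


def assign_new_point(point, centroids, sizes):
    """Assign `point` to the cluster whose SSE grows the least.

    Updates the chosen centroid and cluster size in place and returns
    the cluster index together with the SSE increment that was incurred.
    """
    increments = [sse_increment(point, c, n) for c, n in zip(centroids, sizes)]
    best = min(range(len(increments)), key=increments.__getitem__)
    n = sizes[best]
    # Running-mean update of the centroid after absorbing the new point.
    centroids[best] = [(n * c + p) / (n + 1) for c, p in zip(centroids[best], point)]
    sizes[best] = n + 1
    return best, increments[best]


if __name__ == "__main__":
    # Hypothetical two-feature lifelog clusters (illustrative values only).
    centroids = [[0.0, 0.0], [10.0, 10.0]]
    sizes = [4, 1]
    index, increment = assign_new_point([6.0, 6.0], centroids, sizes)
    print(index, increment, centroids[index], sizes)
```

Because the increment is weighted by cluster size, this criterion can differ from plain nearest-centroid assignment: for the same distance, a large cluster incurs a larger increase than a small one.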
