• Title/Abstract/Keywords: Disease Prediction

Comparison of Machine Learning Methodology in COPD Cohort Data

  • 정현명;박헌진;이진국;이종민
    • 한국빅데이터학회지 / Vol. 2, No. 2 / pp.115-128 / 2017
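  • Machine learning methods are now widely used for their high predictive power, but resolving the limitations of the data with statistical techniques can yield even higher predictive power. In this study, we applied the SMOTE method to longitudinal and imbalanced data to resolve the class imbalance problem and confirmed that predictive power increased. In addition, although research on acute exacerbation of chronic obstructive pulmonary disease (COPD) is active, it has been limited to identifying factors associated with acute exacerbation; studies that observe multiple factors jointly, or that predict acute exacerbation with a prediction model, have not been carried out. In this study, we identified which factors are associated with acute exacerbation of COPD when multiple factors are examined together, and built a personalized, disease-specific prediction model.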
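
The SMOTE step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how SMOTE was applied to the longitudinal records, and the function name `smote`, its parameters, and the toy points are assumptions made for illustration. The core idea is that each synthetic minority sample is an interpolation between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def smote(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class points.

    Hypothetical helper for illustration only; real projects would more
    likely use a maintained library implementation.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        # Pick a minority sample at random.
        x = rng.choice(minority)
        # Find its k nearest neighbours among the other minority samples.
        others = [p for p in minority if p is not x]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        # Interpolate at a random position on the segment from x to nb.
        gap = rng.random()
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic


# Toy minority class with two features (made-up values).
minority_points = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [3.0, 3.0]]
new_points = smote(minority_points, n_new=5)
```

Because every synthetic point lies on a segment between two existing minority points, the new samples stay inside the region already occupied by the minority class.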

Application of Pharmacovigilance Methods in Occupational Health Surveillance: Comparison of Seven Disproportionality Metrics

  • Bonneterre, Vincent;Bicout, Dominique Joseph;De Gaudemaris, Regis
    • Safety and Health at Work / Vol. 3, No. 2 / pp.92-100 / 2012
  • Objectives: The French National Occupational Diseases Surveillance and Prevention Network (RNV3P) is a French network of occupational disease specialists that collects, in standardised coded reports, all cases in which a physician of any specialty referred a patient to a university occupational disease centre to establish the relation between the observed disease and occupational exposures, independently of statutory considerations related to compensation. The objective is to compare the relevance of disproportionality measures, widely used in pharmacovigilance, for detecting potentially new disease ${\times}$ exposure associations in the RNV3P database (by analogy with the detection of potentially new health event ${\times}$ drug associations in spontaneous reporting databases in pharmacovigilance). Methods: RNV3P data from 2001-2009 are used (81,132 observations leading to 11,627 disease ${\times}$ exposure associations). The structure of the RNV3P database is compared with that of pharmacovigilance databases. Seven disproportionality metrics are tested and their results, notably in terms of ranking the disease ${\times}$ exposure associations, are compared. Results: The RNV3P and pharmacovigilance databases showed a similar structure. Frequentist methods (proportional reporting ratio [PRR], reporting odds ratio [ROR]) and a Bayesian one (known as BCPNN, for "Bayesian Confidence Propagation Neural Network") behaved rather similarly on our data, unlike other methods (such as Poisson). The PRR method was ultimately chosen, because more complex methods did not add value with the RNV3P data. Accordingly, a procedure is suggested for detecting signals with the PRR method, automatically triaging out associations that are already known, and then investigating the remaining signals. Conclusion: This procedure may be seen as a first step of hypothesis generation before launching epidemiological and/or experimental studies.
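
The two frequentist metrics named in this abstract, PRR and ROR, are computed from a 2x2 table of report counts. The sketch below uses the standard pharmacovigilance definitions, transposed to disease and exposure by the analogy the abstract draws. The cell labels `a`, `b`, `c`, `d` and the counts are illustrative assumptions, not figures from RNV3P.

```python
def prr(a, b, c, d):
    """Proportional reporting ratio.

    a: reports with the exposure and the disease of interest
    b: reports with the exposure and any other disease
    c: reports with other exposures and the disease of interest
    d: reports with other exposures and any other disease
    """
    return (a / (a + b)) / (c / (c + d))


def ror(a, b, c, d):
    """Reporting odds ratio for the same 2x2 table."""
    return (a * d) / (b * c)


# Made-up counts for one disease x exposure pair.
example_prr = prr(20, 80, 100, 9800)
example_ror = ror(20, 80, 100, 9800)
```

With these made-up counts the disease accounts for 20% of reports under the exposure but about 1% of reports under all other exposures, so both ratios come out well above 1, which is the pattern a signal-detection procedure would flag for review.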

Corelationship Study between Hwa-Byung and Coronary Heart Disease, by using Framingham Coronary Risk Score

  • 정하룡;고상백;박종구;유준상;이재혁
    • 동의신경정신과학회지 / Vol. 22, No. 3 / pp.13-22 / 2011
  • Objectives: This study investigated the relationship between Hwa-Byung and cardiovascular disease as assessed by the Framingham coronary risk score (FRS). Methods: 649 people participated in a community-based cohort study in Wonju City, South Korea, from July 2 to August 30, 2006. Trained investigators measured systolic and diastolic blood pressure and administered the Hwa-Byung Diagnostic Interview Schedule (HBDIS) and a cohort questionnaire covering gender, age, smoking, and diabetes. Blood samples were collected from participants to analyze total cholesterol and HDL-cholesterol. The FRS was calculated from the collected data, and the 10-year prediction of coronary heart disease was determined from the FRS using the score sheet of Wilson et al. The collected data were analyzed with the chi-square test. Results: 1. The number of low-risk people was 18 (52.9%) in the Hwa-Byung group and 263 (42.8%) in the non-Hwa-Byung group (p = 0.472); the difference between the two groups was not significant. 2. The number of people at or below the average 10-year prediction of coronary heart disease for their gender and age was 19 (55.9%) in the Hwa-Byung group and 412 (67.0%) in the non-Hwa-Byung group (p = 0.251); the difference between the two groups was not significant. Conclusions: There was no correlation between Hwa-Byung and the 10-year prediction of coronary heart disease.

Factors Influencing Development and Severity of Grey Leaf Spot of Mulberry (Morus spp.)

  • Kumar, Punathil Meethal Pratheesh;Qadri, Syed Mashayak Hussaini;Pal, Susil Chandra
    • International Journal of Industrial Entomology and Biomaterials / Vol. 22, No. 1 / pp.11-15 / 2011
  • The impact of pruning date, shoot age and weather parameters on the severity and development of grey leaf spot (Pseudocercospora mori) of mulberry was studied. Disease severity (%) increased with shoot age irrespective of pruning date. Maximum disease severity was observed in plants pruned during the second week of October and minimum in plants pruned during the last week of December. Pruning date, shoot age and their interaction had a significant (P<0.05) influence on disease severity. The apparent infection rate (r) was significantly higher during the plant growth period from day 48 to day 55. The average apparent infection rate was highest in plants pruned during the first week of September and lowest in plants pruned during the third and fourth weeks of December. Multiple regression analysis revealed the contribution of various combinations of weather parameters to disease severity. A linear prediction model [$Y = 66.05 - 1.39x_1 - 0.219x_4$] with a significant $R^2$ was developed for predicting the disease under natural epiphytotic conditions.

The Investigation of Employing Supervised Machine Learning Models to Predict Type 2 Diabetes Among Adults

  • Alhmiedat, Tareq;Alotaibi, Mohammed
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 16, No. 9 / pp.2904-2926 / 2022
  • Currently, diabetes is the most common chronic disease in the world, affecting 23.7% of the population in the Kingdom of Saudi Arabia. Diabetes may cause lower-limb amputations, kidney failure and blindness among adults. Therefore, diagnosing the disease in its early stages is essential in order to save human lives. With the revolution in technology, Artificial Intelligence (AI) could play a central role in the early prediction of diabetes by employing Machine Learning (ML) technology. In this paper, we developed a diagnosis system using machine learning models for the detection of type 2 diabetes among adults, adopting two different diabetes datasets, one for training and the other for testing, to analyze and enhance the prediction accuracy. This work offers enhanced classification accuracy as a result of employing several pre-processing methods before applying the ML models. According to the obtained results, the implemented Random Forest (RF) classifier offers the best classification accuracy, with a classification score of 98.95%.

Disease Prediction Using Ranks of Gene Expressions

  • Kim, Ki-Yeol;Ki, Dong-Hyuk;Chung, Hyun-Cheol;Rha, Sun-Young
    • Genomics & Informatics / Vol. 6, No. 3 / pp.136-141 / 2008
  • A large number of studies have been performed to identify biomarkers that will allow efficient detection and determination of the precise status of a patient’s disease. The use of microarrays to assess biomarker status is expected to improve prediction accuracies, because a whole-genome approach is used. Despite their potential, however, patient samples can differ with respect to biomarker status when analyzed on different platforms, making it more difficult to make accurate predictions, because bias may exist between any two different experimental conditions. Because of this difficulty in experimental standardization of microarray data, it is currently difficult to utilize microarray-based gene sets in the clinic. To address this problem, we propose a method that predicts disease status using gene expression data that are transformed by their ranks, a concept that is easily applied to two datasets that are obtained using different experimental platforms. NCI and colon cancer datasets, which were assessed using both Affymetrix and cDNA microarray platforms, were used for method validation. Our results demonstrate that the proposed method is able to achieve good predictive performance for datasets that are obtained under different experimental conditions.
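
The rank transformation described in this abstract can be sketched as follows. The abstract does not give the paper's exact procedure, so this is a generic within-sample ranking with average ranks for ties; the function name `rank_transform` and the expression values are made up for illustration.

```python
def rank_transform(values):
    """Replace each value by its rank within the sample (1 = smallest).

    Tied values receive the average of the ranks they span.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


# The same three genes measured on two hypothetical platforms
# that report intensities on very different scales.
platform_a = [5.2, 1.1, 3.3]
platform_b = [520.0, 110.0, 330.0]
ranks_a = rank_transform(platform_a)
ranks_b = rank_transform(platform_b)
```

Although the raw intensities differ by two orders of magnitude, the two rank vectors are identical, which is why a classifier trained on ranks from one platform can in principle be applied to ranks from another.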

Cost-Sensitive Learning for Cardio-Cerebrovascular Disease Risk Prediction

  • 이유나;이경희;조완섭
    • 한국빅데이터학회지 / Vol. 6, No. 2 / pp.161-168 / 2021
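  • This study proposes a cardiovascular disease prediction model using machine learning. First, the various differences between the two groups are examined through multidimensional analysis and the results are visualized. In particular, for cases such as disease data, where there is a high class imbalance between the normal group and the patient group, we propose a prediction model that uses cost-sensitive learning, which can improve sensitivity. Prediction models are developed with CART and XGBoost, two representative machine learning techniques, applied to cardiovascular disease patient data, and compared in performance. According to the results, CART showed higher accuracy and specificity than XGBoost, with accuracy of approximately 70% to 74%.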
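
One generic way to make a probabilistic classifier cost-sensitive is to shift its decision threshold according to the misclassification costs. The abstract does not state which cost-sensitive mechanism was used with CART and XGBoost, so the sketch below is only an illustration of the general idea; the function names and the cost values are assumptions. Assuming correct predictions cost nothing, predicting "patient" is cheaper in expectation whenever the predicted probability p satisfies p * cost_fn >= (1 - p) * cost_fp, which gives the threshold cost_fp / (cost_fp + cost_fn).

```python
def cost_sensitive_threshold(cost_fp, cost_fn):
    """Probability threshold that minimises expected misclassification cost.

    cost_fp: cost of a false positive (healthy person flagged as patient)
    cost_fn: cost of a false negative (patient missed)
    """
    return cost_fp / (cost_fp + cost_fn)


def predict_patient(prob_patient, cost_fp=1.0, cost_fn=4.0):
    """Flag a case as patient when its probability reaches the threshold."""
    return prob_patient >= cost_sensitive_threshold(cost_fp, cost_fn)


# With a missed patient assumed to be four times as costly as a false
# alarm, the threshold drops from 0.5 to 0.2.
threshold = cost_sensitive_threshold(1.0, 4.0)
```

Lowering the threshold flags more cases as patients, which raises sensitivity at the expense of specificity.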

Evaluation of the Predictive Pulmonary Function after Pneumonectomy Using Perfusion Lung Scan

  • 김길동;정경영
    • Journal of Chest Surgery / Vol. 28, No. 4 / pp.371-375 / 1995
  • Surgical resection for lung cancer or other disease has recently been required in patients with severely impaired lung function resulting from chronic obstructive pulmonary disease or disease extension, so predicting pulmonary function after lung resection is very important to the thoracic surgeon. We studied the accuracy of predicting postoperative pulmonary function using a perfusion lung scan with 99m technetium macroaggregated albumin in 22 patients who underwent pneumonectomy. The linear regression lines derived from the correlation between the predicted (X) and postoperative measured (Y) values of FEV1 and FVC are as follows: (1) Y (ml) = 0.713X + 381 for FEV1 (r = 0.719, P < 0.01); (2) Y (ml) = 0.645X + 556 for FVC (r = 0.675, P < 0.01). In conclusion, the perfusion lung scan is noninvasive and very accurate for predicting postpneumonectomy pulmonary function.
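
The two regression lines reported in this abstract can be applied directly, as in the short sketch below. The coefficients come from the abstract; the function names and the input values are hypothetical and serve only to show the arithmetic.

```python
def expected_measured_fev1_ml(scan_predicted_ml):
    """Postoperative measured FEV1 (ml) expected from the scan-based prediction."""
    return 0.713 * scan_predicted_ml + 381


def expected_measured_fvc_ml(scan_predicted_ml):
    """Postoperative measured FVC (ml) expected from the scan-based prediction."""
    return 0.645 * scan_predicted_ml + 556


# Hypothetical scan-based predictions, in ml.
fev1_example = expected_measured_fev1_ml(1500)  # 0.713 * 1500 + 381
fvc_example = expected_measured_fvc_ml(2000)    # 0.645 * 2000 + 556
```

For a scan-based FEV1 prediction of 1500 ml the line gives about 1450 ml, and for a scan-based FVC prediction of 2000 ml it gives about 1846 ml.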

DNA methylation-based age prediction from various tissues and body fluids

  • Jung, Sang-Eun;Shin, Kyoung-Jin;Lee, Hwan Young
    • BMB Reports / Vol. 50, No. 11 / pp.546-553 / 2017
  • Aging is a natural and gradual process in human life. It is influenced by heredity, environment, lifestyle, and disease. DNA methylation varies with age, and the ability to predict the age of a donor using DNA from evidence materials at a crime scene is of considerable value in forensic investigations. Recently, many studies have reported age prediction models based on DNA methylation from various tissues and body fluids. These models seem very promising because of their high prediction accuracies. In this review, age-associated changes in DNA methylation and the age prediction models for various tissues and body fluids were examined, and the applicability of the DNA methylation-based age prediction method to forensic investigations was then discussed. This will improve the understanding of DNA methylation markers and their potential to be used as biomarkers in the forensic field, as well as the clinical field.

A Study on the Comparison of Predictive Models of Cardiovascular Disease Incidence Based on Machine Learning

  • Ji Woo SEOK;Won ro LEE;Min Soo KANG
    • 한국인공지능학회지 / Vol. 11, No. 1 / pp.1-7 / 2023
  • This paper compares prediction models for the occurrence of cardiovascular disease. Cardiovascular disease is the No. 1 cause of death worldwide, accounting for 1/3 of deaths, and the No. 2 cause of death in Korea. Primary prevention is the most important factor in preventing cardiovascular diseases before they occur. Early diagnosis and treatment are also important, as they play a role in reducing mortality and morbidity. In an experiment using Azure ML, Logistic Regression showed 88.6% accuracy, Decision Tree showed 86.4% accuracy, and Support Vector Machine (SVM) showed 83.7% accuracy. In addition to accuracy, the AUC values of the ROC curves were 94.5%, 93%, and 92.4%, indicating that the performance of the machine learning models is suitable; among them, the logistic regression model gave the most accurate results. The visualization obtained by comparing the algorithms can serve as an objective aid for diagnosis and guide the direction of diagnoses made by doctors in the actual medical field.