• Title/Summary/Keyword: Predictive Accuracy

Identification of High-risk Groups of Suicide from the Depressed Elderly using Decision Tree Analysis (의사결정나무 분석법을 이용한 우울 노인 중 자살 고위험군 규명)

  • Hong, Sehoon;Lee, Dongwon
    • Research in Community and Public Health Nursing / v.30 no.2 / pp.130-140 / 2019
  • Purpose: This study explored levels of suicidal ideation and identified subgroups at high suicidal risk among the depressed elderly in Korea. Methods: A descriptive cross-sectional design was applied to secondary data from the 6th (1st year) Korean National Health and Nutrition Examination Survey (KNHANES). The sample comprised 239 depressed elders aged 60 or over who participated in the KNHANES. The prevalence of suicidal ideation and its related factors, including sociodemographic, physical, and psychological characteristics and quality of life (EQ-5D index), were examined. Descriptive statistics and a decision tree analysis were performed using SPSS/WIN 23.0 and SPSS Modeler 14.2. Results: Of the depressed elderly, 28.9% had suicidal ideation. Three groups with high suicidal ideation were identified. Predictive factors included perceived stress level, household income level, quality of life, and restriction of activity. The highest-risk group consisted of depressed elderly with moderate or low levels of stress, an EQ-5D index below .71, and restriction of activity; 80.0% of these participants had suicidal ideation. The accuracy of the model was 80.8%, its sensitivity 85.9%, and its specificity 68.1%. Conclusion: Multi-dimensional interventions should be designed to decrease suicide among the depressed elderly, focusing particularly on subgroups with high risk factors. The findings are expected to inform future policy design for preventing suicide among the depressed elderly.
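The three performance figures reported in this abstract (accuracy, sensitivity, specificity) all derive from a confusion matrix. The following is a minimal sketch of that arithmetic; the function name and the counts are hypothetical and chosen only for illustration, they are not the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total       # share of all cases classified correctly
    sensitivity = tp / (tp + fn)       # true positive rate
    specificity = tn / (tn + fp)       # true negative rate
    return accuracy, sensitivity, specificity


# Hypothetical counts for illustration only (not taken from the study).
acc, sens, spec = classification_metrics(tp=40, fn=10, tn=120, fp=30)
print(f"accuracy={acc:.3f} sensitivity={sens:.3f} specificity={spec:.3f}")
```

Which class is treated as "positive" changes sensitivity and specificity but not accuracy, so the choice should be stated when such figures are reported.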

Machine Learning based Firm Value Prediction Model: using Online Firm Reviews (머신러닝 기반의 기업가치 예측 모형: 온라인 기업리뷰를 활용하여)

  • Lee, Hanjun;Shin, Dongwon;Kim, Hee-Eun
    • Journal of Internet Computing and Services / v.22 no.5 / pp.79-86 / 2021
  • As the usefulness of big data analysis has drawn attention, many studies in business research have begun to use big data to predict firm performance. Previous studies rely mainly on data from outside the firm, such as news articles and social media platforms. Voices within the firm, in the form of employee satisfaction or evaluations of the firm's strengths and weaknesses, can also potentially affect firm value. However, there is insufficient evidence that online employee reviews are valid predictors of firm value, because the data is relatively difficult to obtain. To fill this gap, we used 97,216 reviews from 2014 to 2019 collected from JobPlanet, an online firm review website in Korea, and developed a machine learning-based predictive model. Among the proposed models, the LSTM-based model showed the highest accuracy, 73.2%, and the lowest MAE, 0.359. We expect this study to serve as a useful case in the field of firm value prediction for domestic companies.

Usefulness of Triglyceride and Glucose Index to Predict the Risk of Hyperuricemia in Korean Adults (한국 성인에서 고요산혈증 위험을 예측하기 위한 중성지방-혈당 지수의 유용성)

  • Shin, Kyung-A;Kim, Eun Jae
    • Journal of the Korea Convergence Society / v.11 no.12 / pp.283-290 / 2020
  • The purpose of this study was to evaluate the usefulness of the triglyceride and glucose (TyG) index for predicting the risk of hyperuricemia in Korean adults. This study included 14,266 men and 9,033 women over 20 years old who underwent health screenings from 2017 to 2019 at a general hospital in Seoul. To assess the risk of hyperuricemia and the predictive ability of the TyG index, logistic regression analysis was performed and ROC curves were obtained. The accuracy of the TyG index for predicting hyperuricemia was 0.68 overall, 0.61 for men, and 0.67 for women (p<0.001 for each). The risk of hyperuricemia was 1.69 times higher in the fourth quartile of the TyG index than in the first quartile, 2.03 times higher in men, and 2.07 times higher in women (p<0.05 for each). Thus the TyG index was not of high diagnostic usefulness as a screening test for hyperuricemia, but it was associated with hyperuricemia.
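The abstract does not say which ROC summary its "accuracy" values represent. Assuming they are areas under the ROC curve (AUC), the sketch below shows one way such a value can be computed: the AUC equals the probability that a randomly chosen positive case has a higher index value than a randomly chosen negative case, with ties counted as one half. The function name and the sample values are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical index values and outcomes (1 = hyperuricemia), for illustration only.
tyg_values = [8.1, 8.4, 8.6, 8.9, 9.0, 9.3]
outcomes = [0, 0, 1, 0, 1, 1]
auc = roc_auc(tyg_values, outcomes)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why values in the 0.6 to 0.7 range are usually read as limited discriminative ability.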

Analysis of Occupational Injury and Feature Importance of Fall Accidents on the Construction Sites using Adaboost (에이다 부스트를 활용한 건설현장 추락재해의 강도 예측과 영향요인 분석)

  • Choi, Jaehyun;Ryu, HanGuk
    • Journal of the Architectural Institute of Korea Structure & Construction / v.35 no.11 / pp.155-162 / 2019
  • The construction industry accounts for 28.55% of all industrial accidents in Korea, the largest share of any industry. Falls are the most common accident type, making up 60.16% of accidents in the construction field. We therefore analyzed the major factors affecting fall accidents and derived feature importances by considering various variables. We used data collected from the Korea Occupational Safety & Health Agency (KOSHA) for training and prediction in the proposed model. We attempted to predict the degree of occupational fall accidents using a machine learning model, AdaBoost (short for Adaptive Boosting). AdaBoost is a machine learning meta-algorithm that can be used in conjunction with many other types of learning algorithms to improve performance. In this model, decision trees were combined with AdaBoost to predict and classify the degree of occupational fall accidents. Hyperopt was used to optimize hyperparameters, combined with stratified k-fold cross-validation. We extracted and analyzed the feature importances of factors affecting fall accidents using the permutation technique. We evaluated the model's predictions of the degree of fall accidents in terms of predictive accuracy. The machine learning model was also confirmed to be applicable to safety accident analysis on construction sites. In the future, if safety accident data is accumulated automatically and in real time in a network system using IoT (Internet of Things) technology on the construction site, it will be possible to analyze the factors and types of accidents according to site conditions from the real-time data.
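The permutation technique mentioned in this abstract can be sketched as follows: shuffle one feature column at a time and measure how much the model's accuracy drops. This is a minimal illustration, not the authors' pipeline. The threshold rule stands in for a fitted AdaBoost classifier, and the feature values, function names, and repeat count are all hypothetical.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the correct label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, n_repeats=10, seed=0):
    """Mean drop in accuracy after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only column j replaced by its shuffled copy.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            drops.append(baseline - accuracy(model, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy data: feature 0 (fall height, m) decides the label;
# feature 1 (worker age) is ignored by the stand-in model.
heights = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ages = [30, 45, 25, 50, 35, 40, 28, 55, 33, 47]
rows = [[h, a] for h, a in zip(heights, ages)]
labels = [int(h > 5) for h in heights]


def model(row):
    """Stand-in for a fitted classifier (hypothetical threshold rule)."""
    return int(row[0] > 5)


imp = permutation_importance(model, rows, labels)
print(imp)
```

A feature the model never uses gets an importance of exactly zero, because shuffling it cannot change any prediction.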

Estimating the compressive strength of HPFRC containing metallic fibers using statistical methods and ANNs

  • Perumal, Ramadoss;Prabakaran, V.
    • Advances in concrete construction / v.10 no.6 / pp.479-488 / 2020
  • Experimental and numerical work was carried out on high performance fiber reinforced concrete (HPFRC) with w/cm ratios ranging from 0.25 to 0.40, fiber volume fraction (Vf)=0-1.5%, and 10% silica fume replacement. The improvements in compressive and flexural strength obtained for HPFRC are moderate and significant, respectively. Empirical equations were developed for the compressive strength and flexural strength of HPFRC as a function of fiber volume fraction. A relation between the flexural strength and compressive strength of HPFRC with R=0.78 was developed. Because of the complex mix proportions and the non-linear relationship between mix proportions and properties, models with reliable predictive capabilities have not been developed, and research on HPFRC has been largely empirical. Given the inadequacy of present methods, this paper employs a back propagation-neural network (BP-NN) to estimate the 28-day compressive strength of HPFRC mixes. The BP-NN model was built to capture the highly non-linear relationship between the mix proportions and their properties. This paper describes the data sets collected, the training of the ANNs, and a comparison of the experimental results obtained for various mixtures. From statistical analyses of the collected data, a multiple linear regression (MLR) model with R^2=0.78 was developed for predicting the compressive strength of HPFRC mixes, and the average absolute error (AAE) obtained is 6.5%. On validation of the data sets by NNs, the error range was within 2% of the actual values. The ANN model gave a significant degree of accuracy and reliability compared with the MLR model. The ANN approach can be used effectively to estimate the 28-day compressive strength of fibrous concrete mixes and is practical.
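The two goodness-of-fit figures quoted for the MLR model (R^2 and AAE) can be computed as in the sketch below. The abstract does not define AAE; the code assumes it is the mean of |measured - predicted| / measured, expressed in percent. The strength values and function names are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def average_absolute_error_pct(actual, predicted):
    """Assumed AAE definition: mean relative absolute error, in percent."""
    ratios = [abs(a - p) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical 28-day compressive strengths in MPa, for illustration only.
strength_measured = [60.0, 70.0, 80.0, 90.0]
strength_predicted = [63.0, 68.0, 82.0, 87.0]

r2 = r_squared(strength_measured, strength_predicted)
aae = average_absolute_error_pct(strength_measured, strength_predicted)
print(f"R^2 = {r2:.3f}, AAE = {aae:.2f}%")
```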

Application of sequence to sequence learning based LSTM model (LSTM-s2s) for forecasting dam inflow (Sequence to Sequence based LSTM (LSTM-s2s)모형을 이용한 댐유입량 예측에 대한 연구)

  • Han, Heechan;Choi, Changhyun;Jung, Jaewon;Kim, Hung Soo
    • Journal of Korea Water Resources Association / v.54 no.3 / pp.157-166 / 2021
  • Highly reliable forecasting of dam inflow is required for efficient dam operation. In this study, a deep learning technique, one of the data-driven methods used in many fields of research, was applied to predict dam inflow. The Long Short-Term Memory deep learning model with a Sequence-to-Sequence structure (LSTM-s2s), which performs well in predicting time-series data, was applied to forecast the inflow of the Soyang River dam. Several statistical metrics, including the correlation coefficient (CC), Nash-Sutcliffe efficiency coefficient (NSE), percent bias (PBIAS), and error in peak value (PE), were used to evaluate the predictive performance of the model. The results showed that the LSTM-s2s model predicted dam inflow with high accuracy and also performed well in event-based runoff prediction. The deep learning-based approach could therefore be used for efficient dam operation in water resource management during wet and dry seasons.
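Three of the evaluation metrics named in this abstract can be sketched as below. Sign conventions for PBIAS and PE vary between sources. This sketch assumes PBIAS = 100 * sum(observed - simulated) / sum(observed), so positive values mean underestimation, and PE as the relative difference between simulated and observed peaks in percent. The inflow values and function names are hypothetical.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the observed mean."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def pbias(observed, simulated):
    """Percent bias (assumed convention: positive means underestimation)."""
    return 100.0 * sum(o - s for o, s in zip(observed, simulated)) / sum(observed)


def peak_error(observed, simulated):
    """Relative error in peak value, in percent (assumed definition)."""
    return 100.0 * (max(simulated) - max(observed)) / max(observed)


# Hypothetical inflow series in m^3/s, for illustration only.
inflow_observed = [10.0, 20.0, 40.0, 30.0]
inflow_simulated = [12.0, 18.0, 36.0, 30.0]

print(nse(inflow_observed, inflow_simulated))
print(pbias(inflow_observed, inflow_simulated))
print(peak_error(inflow_observed, inflow_simulated))
```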

Comparison of CT Exposure Dose Prediction Models Using Machine Learning-based Body Measurement Information (머신러닝 기반 신체 계측정보를 이용한 CT 피폭선량 예측모델 비교)

  • Hong, Dong-Hee
    • Journal of radiological science and technology / v.43 no.6 / pp.503-509 / 2020
  • This study aims to develop a patient-specific radiation exposure dose prediction model based on anthropometric data that can be easily measured during CT examination, to serve as basic data for DRL setting and radiation dose management systems in the future. It also identifies which machine learning algorithm is most suitable for predicting exposure doses. The data used in this study were chest CT scan data, and a data set was constructed that included the patients' anthropometric data. In pre-processing and sample selection, from a total of 250 samples, the 110 that were chest CT scans performed without a contrast agent and that included height and weight variables were extracted. Of the 110 samples extracted, 66% was used as a training set, and the remaining 44% were used as a test set for verification. The exposure dose was predicted with random forest, linear regression analysis, and SVM algorithms using Orange version 3.26.0, an open-source machine learning tool. Results: The prediction accuracy of the algorithm models was R^2 0.840 for random forest, R^2 0.969 for linear regression analysis, and R^2 0.189 for SVM. In verifying the prediction rate of the algorithm models, the random forest was the highest with R^2 0.986, compared with R^2 0.973 for linear regression analysis and R^2 0.204 for SVM, indicating that the random forest model has the best predictive power.

Quality Prediction Model for Manufacturing Process of Free-Machining 303-series Stainless Steel Small Rolling Wire Rods (쾌삭 303계 스테인리스강 소형 압연 선재 제조 공정의 생산품질 예측 모형)

  • Seo, Seokjun;Kim, Heungseob
    • Journal of Korean Society of Industrial and Systems Engineering / v.44 no.4 / pp.12-22 / 2021
  • This article proposes a machine learning model (a classifier) for predicting the production quality of free-machining 303-series stainless steel (STS303) small rolling wire rods according to the operating conditions of the manufacturing process. To develop the classifier, manufacturing data for 37 operating variables were collected from the manufacturing execution system (MES) of Company S, and 12 types of derived variables were generated based on a literature review and interviews with field experts. The research proceeded through data preprocessing, exploratory data analysis, feature selection, machine learning modeling, and the evaluation of alternative models. In the preprocessing stage, missing values and outliers are removed, and oversampling with SMOTE (Synthetic Minority Over-sampling Technique) is applied to resolve data imbalance. Features are selected by the variable importance of LASSO (least absolute shrinkage and selection operator) regression, extreme gradient boosting (XGBoost), and random forest models. Finally, logistic regression, support vector machine (SVM), random forest, and XGBoost classifiers are developed to predict whether products will be adequate or defective under new operating conditions. The optimal hyper-parameters for each model are searched by grid search and random search based on k-fold cross-validation. In the experiment, XGBoost showed relatively high predictive performance compared with the other models, with an accuracy of 0.9929, specificity of 0.9372, F1-score of 0.9963, and logarithmic loss of 0.0209. The classifier developed in this study is expected to improve productivity by enabling effective management of the manufacturing process for STS303 small rolling wire rods.
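The core idea of SMOTE, as named in this abstract, is to create synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below illustrates that idea only. The abstract gives no parameters, so the neighbour count `k`, the seed, the function name, and the sample points are all hypothetical, and this is not the authors' implementation.

```python
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Generate synthetic points on segments between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class samples (e.g. defective products), two features each.
minority = [[1.0, 1.0], [2.0, 1.5], [1.5, 3.0], [3.0, 2.5]]
synthetic = smote(minority, n_synthetic=5)
print(synthetic)
```

Because every synthetic point lies on a segment between two real minority points, oversampling of this kind should be applied to the training folds only, so that synthetic data does not leak into validation.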

Predicting patient experience of Invisalign treatment: An analysis using artificial neural network

  • Xu, Lin;Mei, Li;Lu, Ruiqi;Li, Yuan;Li, Hanshi;Li, Yu
    • The korean journal of orthodontics / v.52 no.4 / pp.268-277 / 2022
  • Objective: Poor experience with Invisalign treatment affects patient compliance and, thus, treatment outcome. Knowing the potential discomfort level in advance can help orthodontists better prepare the patient to overcome the difficult stage. This study aimed to construct artificial neural networks (ANNs) to predict patient experience in the early stages of Invisalign treatment. Methods: In total, 196 patients were enrolled. Data collection included questionnaires on pain, anxiety, and quality of life (QoL). A four-layer fully connected multilayer perceptron with three backpropagations was constructed to predict patient experience of the treatment. The input data comprised 17 clinical features. The partial derivative method was used to calculate the relative contribution of each input in the ANNs. Results: The predictive success rates for pain, anxiety, and QoL were 87.7%, 93.4%, and 92.4%, respectively. ANNs for predicting pain, anxiety, and QoL yielded areas under the curve of 0.963, 0.992, and 0.982, respectively. The number of teeth with lingual attachments was the most important factor affecting the outcome of negative experience, followed by the number of lingual buttons and upper incisors with attachments. Conclusions: The ANNs constructed in this preliminary study show good accuracy in predicting patient experience (i.e., pain, anxiety, and QoL) of Invisalign treatment. The artificial intelligence system developed for predicting patient comfort has potential for clinical application to enhance patient compliance.

Prediction of Food Franchise Success and Failure Based on Machine Learning (머신러닝 기반 외식업 프랜차이즈 가맹점 성패 예측)

  • Ahn, Yelyn;Ryu, Sungmin;Lee, Hyunhee;Park, Minseo
    • The Journal of the Convergence on Culture Technology / v.8 no.4 / pp.347-353 / 2022
  • In the restaurant industry, start-ups are active because of high consumer demand and low entry barriers. However, the industry has a high closure rate, and among franchises there is large variation in sales within the same brand. Research is therefore needed to help prevent the closure of food franchises. This study examines the factors affecting franchise sales and uses machine learning techniques to predict the success and failure of franchises. Various factors that affect franchise sales are extracted from Point of Sale (PoS) data of food franchises and public data for Gangnam-gu, Seoul. For more valid variable selection, multicollinearity is removed using the Variance Inflation Factor (VIF). Finally, classification models are used to predict the success and failure of food franchise stores. With this method, we propose a success and failure prediction model for food franchise stores with an accuracy of 0.92.
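The VIF screening step mentioned in this abstract can be sketched without external libraries. The VIF of predictor j equals the j-th diagonal element of the inverse of the predictors' correlation matrix, which for two predictors reduces to 1 / (1 - r^2). The variable names and values below are hypothetical, and any cut-off used to drop a variable is a convention the abstract does not specify.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def invert(matrix):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif(columns):
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    corr = [[pearson(a, b) for b in columns] for a in columns]
    inverse = invert(corr)
    return [inverse[j][j] for j in range(len(columns))]


# Hypothetical predictors for five stores, for illustration only.
foot_traffic = [1.0, 2.0, 3.0, 4.0, 5.0]
nearby_stores = [2.0, 1.0, 4.0, 3.0, 5.0]
rent_rank = [4.0, 1.0, 3.0, 5.0, 2.0]
print(vif([foot_traffic, nearby_stores, rent_rank]))
```

A VIF close to 1 indicates a predictor that is nearly uncorrelated with the others, while larger values indicate that much of its variance is explained by the remaining predictors.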