• Title/Summary/Keyword: Predictive Power

Comparison of CT Exposure Dose Prediction Models Using Machine Learning-based Body Measurement Information (머신러닝 기반 신체 계측정보를 이용한 CT 피폭선량 예측모델 비교)

  • Hong, Dong-Hee
    • Journal of radiological science and technology
    • /
    • v.43 no.6
    • /
    • pp.503-509
    • /
    • 2020
  • This study aims to develop a patient-specific radiation dose prediction model based on anthropometric data that are easy to measure during a CT examination, and to provide basic data for setting diagnostic reference levels (DRLs) and for future radiation dose management systems. It also identifies which machine learning algorithm is best suited to predicting exposure dose. The data were chest CT scan records that include the patient's anthropometric data. During pre-processing and sample selection, 110 of the 250 samples were retained: chest CT scans performed without a contrast agent that also recorded height and weight. Of these 110 samples, 66% were used as the training set and 44% as the test set for verification. Exposure dose was predicted with random forest, linear regression, and support vector machine (SVM) algorithms in Orange version 3.26.0, an open-source machine learning tool. Results: The prediction accuracy of the models was R^2 = 0.840 for random forest, 0.969 for linear regression, and 0.189 for SVM. In verification, random forest scored highest with R^2 = 0.986, followed by linear regression at 0.973 and SVM at 0.204, indicating that random forest has the best predictive power.
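The R^2 values quoted above are coefficients of determination computed on held-out data. The following is a minimal sketch of that evaluation step, not the study's Orange workflow: the data are synthetic, the single predictor `weights`, the 70-row training split, and the helper names `r_squared` and `fit_line` are all assumptions made for illustration.

```python
import random

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_line(xs, ys):
    """Ordinary least squares with one predictor; returns (slope, intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

# Synthetic stand-in data (NOT the study's measurements): dose rises with weight.
rng = random.Random(0)
weights = [rng.uniform(45, 100) for _ in range(110)]
doses = [0.08 * w + 1.0 + rng.gauss(0, 0.3) for w in weights]

n_train = 70  # hypothetical split size, chosen only for this example
slope, intercept = fit_line(weights[:n_train], doses[:n_train])

# Score the fitted line on rows it has not seen.
test_pred = [slope * w + intercept for w in weights[n_train:]]
test_r2 = r_squared(doses[n_train:], test_pred)
print(f"held-out R^2 = {test_r2:.3f}")
```

An R^2 near 1 means the model explains most of the variance in the held-out doses; a value near 0 means it does little better than predicting the mean.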

Study on Improvement of Frost Occurrence Prediction Accuracy (서리발생 예측 정확도 향상을 위한 방법 연구)

  • Kim, Yongseok;Choi, Wonjun;Shim, Kyo-moon;Hur, Jina;Kang, Mingu;Jo, Sera
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.23 no.4
    • /
    • pp.295-305
    • /
    • 2021
  • In this study, we built a frost occurrence model with Random Forest (RF) by selecting the meteorological factors related to frost occurrence. We found that, when constructing a classification model for frost occurrence, imbalance in the development data set degrades the model's predictive power even when the data set is large. Building a single integrated model by grouping frost-related meteorological factors by region proved more efficient than building separate models that each reflect the high-importance meteorological factors. Based on these results, a high-accuracy frost occurrence prediction model is expected to become feasible as meteorological factors for frost prediction are studied further.
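The point about data set imbalance can be illustrated with a toy calculation. This is a sketch with made-up labels (95 frost-free days, 5 frost days), not the study's data; the helper `binary_metrics` is a hypothetical name.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, recall, specificity, and balanced accuracy."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "specificity": specificity,
        "balanced_accuracy": (recall + specificity) / 2,
    }

# Made-up, heavily imbalanced labels: 1 = frost, 0 = no frost.
y_true = [0] * 95 + [1] * 5

# A degenerate "model" that never predicts frost.
y_pred = [0] * 100

metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

The degenerate model scores 95% accuracy while detecting no frost events at all, which is why recall or balanced accuracy is usually a more informative measure when one class is rare.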

A Study on the Employee Turnover Prediction using XGBoost and SHAP (XGBoost와 SHAP 기법을 활용한 근로자 이직 예측에 관한 연구)

  • Lee, Jae Jun;Lee, Yu Rin;Lim, Do Hyun;Ahn, Hyun Chul
    • The Journal of Information Systems
    • /
    • v.30 no.4
    • /
    • pp.21-42
    • /
    • 2021
  • Purpose: For companies to continue to grow, they must properly manage human resources, which are the core of corporate competitiveness. Employee turnover means the loss of talent in the workforce. When an employee voluntarily leaves, the company loses its hiring and training costs, and the departure can lead to the withdrawal of key personnel and to new costs for training a replacement. From the employee's viewpoint, moving to another company is also risky because it can be time consuming and costly. Therefore, to reduce the social and economic costs caused by employee turnover, it is necessary to accurately predict employee turnover intention, identify the factors affecting it, and manage them appropriately within the company. Design/methodology/approach: Prior studies have mainly used logistic regression and decision trees, which have explanatory power but poor predictive accuracy. To develop a more accurate prediction model, XGBoost is proposed as the classification technique. Then, to compensate for its lack of explainability, SHAP, one of the explainable AI (XAI) techniques, is applied. As a result, the prediction accuracy of the proposed model improves on conventional methods such as logistic regression (LOGIT) and decision trees. By applying SHAP to the proposed model, the factors affecting overall employee turnover intention, as well as a specific sample's turnover intention, are identified. Findings: Experimental results show that the prediction accuracy of XGBoost is superior to that of logistic regression and decision trees. Using SHAP, we find that jobseeking, annuity, eng_test, comm_temp, seti_dev, seti_money, equl_ablt, and sati_safe significantly affect overall employee turnover intention. In addition, it is confirmed that the factors affecting an individual's turnover intention are more diverse. Our research findings imply that companies should adopt a personalized approach for each employee in order to effectively prevent his or her turnover.
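SHAP attributes a single prediction to individual features using Shapley values. The sketch below computes exact Shapley values by brute-force enumeration for a tiny hand-written model. It is not the SHAP library and not the paper's XGBoost model: `toy_model`, its coefficients, and the use of a single all-zero baseline (instead of a background data set) are assumptions made for illustration.

```python
from itertools import combinations
from math import factorial

def shapley_values(model, instance, baseline):
    """Exact Shapley values by enumerating every feature coalition.

    Features inside a coalition take the instance's value; features
    outside it stay at the baseline value. Cost grows as 2^n, so this
    is only practical for a handful of features.
    """
    n = len(instance)

    def value(coalition):
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return model(mixed)

    phis = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi += weight * (value(s | {i}) - value(s))
        phis.append(phi)
    return phis

def toy_model(x):
    """Hypothetical turnover score with one interaction term (x0 * x2)."""
    return 0.5 * x[0] + 0.3 * x[1] + 0.2 * x[0] * x[2]

instance = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phis = shapley_values(toy_model, instance, baseline)
print(phis)
```

The attributions sum to the difference between the prediction for the instance and the prediction for the baseline, and the interaction term is split evenly between the two features that take part in it. Tree-specific SHAP implementations reach the same kind of attribution without enumerating coalitions.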

Comparison of Machine Learning Classification Models for the Development of Simulators for General X-ray Examination Education (일반엑스선검사 교육용 시뮬레이터 개발을 위한 기계학습 분류모델 비교)

  • Lee, In-Ja;Park, Chae-Yeon;Lee, Jun-Ho
    • Journal of radiological science and technology
    • /
    • v.45 no.2
    • /
    • pp.111-116
    • /
    • 2022
  • This study evaluates the applicability of machine learning to the development of a simulator for general X-ray examination education. To this end, k-nearest neighbor (kNN), support vector machine (SVM), and neural network (NN) classification models are compared to identify the most suitable model. Image data were obtained by taking 100 images for each of four projections: posterior-anterior (PA), posterior-anterior oblique (Obl), lateral (Lat), and fan lateral (Fan lat). Of the 400 acquired images, 70% were used as the training set and 30% as the test set for evaluation. A prediction model was constructed to classify right-hand PA, Obl, Lat, and Fan lat images. After building classification models with kNN, SVM, and NN on this data set, the models were compared using a confusion (error) matrix. In the evaluation, kNN achieved an accuracy of 0.967 and an area under the curve (AUC) of 0.993, SVM an accuracy of 0.992 and an AUC of 1.000, and NN an accuracy of 0.992 and an AUC of 0.999. kNN was slightly lower, but all three models recorded high accuracy and AUC. SVM and NN had the same accuracy (0.992) and similar AUC (1.000 and 0.999), indicating that both models have high predictive power and are applicable to educational simulators.
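The error-matrix comparison described above can be sketched as follows. The class labels come from the abstract; the predictions are made up, and the 120-image test set simply applies the stated 30% split to 400 images. Whether the study scored a single held-out split in this way is an assumption.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix

def accuracy(matrix):
    """Share of samples on the diagonal (correctly classified)."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

labels = ["PA", "Obl", "Lat", "Fan lat"]

# Made-up test set: 30 images per projection, 120 in total.
y_true = [label for label in labels for _ in range(30)]

# Made-up predictions: everything correct except one Lat image
# that is mistaken for Fan lat.
y_pred = list(y_true)
y_pred[y_true.index("Lat")] = "Fan lat"

matrix = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, matrix):
    print(f"{label:8s} {row}")
print(f"accuracy = {accuracy(matrix):.3f}")
```

With 120 test images, a single misclassification gives 119/120, which rounds to 0.992, so small differences in accuracy at this sample size correspond to only a few images.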

A multi-layer approach to DN 50 electric valve fault diagnosis using shallow-deep intelligent models

  • Liu, Yong-kuo;Zhou, Wen;Ayodeji, Abiodun;Zhou, Xin-qiu;Peng, Min-jun;Chao, Nan
    • Nuclear Engineering and Technology
    • /
    • v.53 no.1
    • /
    • pp.148-163
    • /
    • 2021
  • Timely fault identification is important for safe and reliable operation of the electric valve system. Many research works have used data-driven approaches for fault diagnosis in complex systems. However, they do not consider the specific characteristics of critical control components such as electric valves. This work presents an integrated shallow-deep fault diagnostic model, developed from signals extracted from a DN50 electric valve. First, the local-optimum problem of the particle swarm optimization algorithm is addressed by optimizing the weight search capability, the particle speed, and the position update strategy. Then, to develop a shallow diagnostic model, the modified particle swarm algorithm is combined with a support vector machine to form a hybrid improved particle swarm-support vector machine (IPs-SVM). To decouple the influence of background noise, the wavelet packet transform method is used to reconstruct the vibration signal. Thereafter, the IPs-SVM is used to classify phase imbalance and damaged valve faults, and its performance is evaluated against models developed using the conventional SVM and the particle swarm optimized SVM. Second, three different deep belief network (DBN) models are developed using different acoustic signal structures: raw signal, wavelet transformed signal, and time-series (sequential) signal. These models are developed to estimate internal leakage sizes in the electric valve. The predictive performance of the DBN and the evaluation results of the proposed IPs-SVM are also presented in this paper.
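For orientation, the sketch below shows a baseline particle swarm optimizer with a linearly decreasing inertia weight. It is a common textbook variant, not the paper's improved algorithm, whose exact weight, speed, and position update rules are not given in the abstract. The function name `pso_minimize`, all parameter values, and the `sphere` test objective are assumptions.

```python
import random

def pso_minimize(objective, dim, bounds, n_particles=30, n_iters=200,
                 w_start=0.9, w_end=0.4, c1=1.5, c2=1.5, seed=0):
    """Baseline PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for it in range(n_iters):
        # Inertia weight shrinks linearly: explore early, refine late.
        w = w_start - (w_start - w_end) * it / (n_iters - 1)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the new position to the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Simple convex test objective with its minimum at the origin."""
    return sum(v * v for v in x)

best_pos, best_val = pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))
print(best_pos, best_val)
```

When a swarm is used to tune an SVM, the objective would typically be a validation error evaluated at candidate hyperparameter values rather than a closed-form function like `sphere`.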

Prognostic Role of Circulating Tumor Cells in the Pulmonary Vein, Peripheral Blood, and Bone Marrow in Resectable Non-Small Cell Lung Cancer

  • Lee, Jeong Moon;Jung, Woohyun;Yum, Sungwon;Lee, Jeong Hoon;Cho, Sukki
    • Journal of Chest Surgery
    • /
    • v.55 no.3
    • /
    • pp.214-224
    • /
    • 2022
  • Background: Studies of the prognostic role of circulating tumor cells (CTCs) in early-stage non-small cell lung cancer (NSCLC) are still limited. This study investigated the prognostic power of CTCs from the pulmonary vein (PV), peripheral blood (PB), and bone marrow (BM) for postoperative recurrence in patients who underwent curative resection for NSCLC. Methods: Forty patients who underwent curative resection for NSCLC were enrolled. Before resection, 10-mL samples were obtained of PB from the radial artery, blood from the PV of the lobe containing the tumor, and BM aspirates from the rib. A microfabricated filter was used for CTC enrichment, and immunofluorescence staining was used to identify CTCs. Results: The pathologic stage was stage I in 8 patients (20%), II in 15 (38%), III in 14 (35%), and IV in 3 (8%). The median number of PB-, PV-, and BM-CTCs was 4, 4, and 5, respectively. A time-dependent receiver operating characteristic curve analysis showed that PB-CTCs had excellent predictive value for recurrence-free survival (RFS), with the highest area under the curve at each time point (first, second, and third quartiles of RFS). In a multivariate Cox proportional hazard regression model, PB-CTCs were an independent risk factor for recurrence (hazard ratio, 10.580; 95% confidence interval, 1.637-68.388; p<0.013). Conclusion: The presence of ≥4 PB-CTCs was an independent poor prognostic factor for RFS, and PV-CTCs and PB-CTCs had a positive linear correlation in patients with recurrence.
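As a consistency check on the reported Cox regression figures, the log hazard ratio and its standard error can be recovered from the hazard ratio and confidence interval. The sketch assumes the interval is a 95% Wald interval, symmetric on the log scale, which the abstract does not state; `wald_summary` is a hypothetical helper name.

```python
import math

def wald_summary(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover beta, its standard error, z, and a two-sided p-value.

    Assumes ci_low and ci_high are exp(beta -/+ z_crit * se).
    """
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    # Two-sided tail probability of the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return beta, se, z, p

# Figures quoted in the abstract for PB-CTCs.
beta, se, z, p = wald_summary(10.580, 1.637, 68.388)

# On the log scale the point estimate should sit midway between the limits.
midpoint_hr = math.exp((math.log(1.637) + math.log(68.388)) / 2)

print(f"beta = {beta:.3f}, se = {se:.3f}, z = {z:.2f}, p = {p:.4f}")
print(f"geometric midpoint of the interval = {midpoint_hr:.3f}")
```

Under that assumption the interval is centered on the reported hazard ratio and the implied p-value is close to the 0.013 figure quoted. The wide interval reflects the small cohort of forty patients.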

Validity of the scoring system for traumatic liver injury: a generalized estimating equation analysis

  • Lee, Kangho;Ryu, Dongyeon;Kim, Hohyun;Jeon, Chang Ho;Kim, Jae Hun;Park, Chan Yong;Yeom, Seok Ran
    • Journal of Trauma and Injury
    • /
    • v.35 no.1
    • /
    • pp.25-33
    • /
    • 2022
  • Purpose: The scoring system for traumatic liver injury (SSTLI) was developed in 2015 to predict mortality in patients with polytraumatic liver injury. This study aimed to validate the SSTLI as a prognostic factor in patients with polytrauma and liver injury through a generalized estimating equation analysis. Methods: The medical records of 521 patients with traumatic liver injury from January 2015 to December 2019 were reviewed. The primary outcome variable was in-hospital mortality. All the risk factors were analyzed using multivariate logistic regression analysis. The SSTLI has five clinical measures (age, Injury Severity Score, serum total bilirubin level, prothrombin time, and creatinine level) chosen based on their predictive power. Each measure is scored as 0-1 (age and Injury Severity Score) or 0-3 (serum total bilirubin level, prothrombin time, and creatinine level). The SSTLI score is the sum of the points for all items (0-11 points). Results: The areas under the curve of the SSTLI to predict mortality on post-traumatic days 0, 1, 3, and 5 were 0.736, 0.783, 0.830, and 0.824, respectively. A very good to excellent positive correlation was observed between the probability of mortality and the SSTLI score (γ=0.997, P<0.001). A value of 5 points was used as the threshold to distinguish low-risk (<5) from high-risk (≥5) patients. Multivariate analysis using the generalized estimating equation in the logistic regression model indicated that the SSTLI score was an independent predictor of mortality (odds ratio, 1.027; 95% confidence interval, 1.018-1.036; P<0.001). Conclusions: The SSTLI was verified to predict mortality in patients with polytrauma and liver injury. A score of ≥5 on the SSTLI indicated a high risk of post-traumatic mortality.
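The scoring arithmetic described above can be sketched as a small function. The abstract gives the point ranges and the 5-point threshold but not the clinical cutoffs that turn raw measurements into points, so this sketch takes the points themselves as input. The function and argument names are hypothetical.

```python
# Maximum points per item, as stated in the abstract.
MAX_POINTS = {
    "age": 1,
    "injury_severity_score": 1,
    "total_bilirubin": 3,
    "prothrombin_time": 3,
    "creatinine": 3,
}

def sstli_score(points):
    """Sum per-item points after checking each against its allowed range."""
    if set(points) != set(MAX_POINTS):
        raise ValueError("expected exactly these items: " + ", ".join(MAX_POINTS))
    for item, value in points.items():
        if not 0 <= value <= MAX_POINTS[item]:
            raise ValueError(f"{item} must be between 0 and {MAX_POINTS[item]}")
    return sum(points.values())

def risk_group(score, threshold=5):
    """Scores at or above the threshold are classed as high risk."""
    return "high" if score >= threshold else "low"

# Hypothetical patient, points already assigned by a clinician.
patient = {
    "age": 1,
    "injury_severity_score": 1,
    "total_bilirubin": 2,
    "prothrombin_time": 1,
    "creatinine": 0,
}
score = sstli_score(patient)
print(score, risk_group(score))
```

The maximum of 11 follows from two items worth up to 1 point and three items worth up to 3 points.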

A Study on Wartime OPCON Transfer Policy Changes Applied Kingdon's Policy Model - Focussing on Administrations of Roh Moo Hyun and Lee Myoung Bak - (Kingdon모형을 적용한 전시 작전통제권 전환 정책변동에 관한 연구 노무현 정부, 이명박 정부를 중심으로-)

  • Lee, JeongHoon
    • The Journal of the Convergence on Culture Technology
    • /
    • v.8 no.5
    • /
    • pp.291-295
    • /
    • 2022
  • The transfer of wartime operational control within its term of office, a promise of the Moon Jae In administration, fell through. In the more than 70 years since operational control was handed over during the Korean War in 1950, the policy of transferring wartime operational control has been decided and reversed several times. The transfer of wartime operational control is a national policy directly related to national security, so it is most important to understand the determinants of each administration's transfer policy. This paper examines two cases in which the wartime operational control policy was adjusted, in 2006 during the Roh Moo Hyun administration and in 2010 during the Lee Myung Bak administration. By analyzing the two administrations' policy changes, it aims to gain policy predictive power and to inform successful policy execution.

Comparative Analysis on the Performance of NHPP Software Reliability Model with Exponential Distribution Characteristics (지수분포 특성을 갖는 NHPP 소프트웨어 신뢰성 모형의 성능 비교 분석)

  • Park, Seung-Kyu
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.17 no.4
    • /
    • pp.641-648
    • /
    • 2022
  • This study comparatively analyzes the performance of NHPP software reliability models with exponential-type distribution characteristics (Exponential Basic, Inverse Exponential, Lindley, Rayleigh) and, on that basis, presents the optimal reliability model. To analyze software failure behavior, failure time data collected during system operation were used, and the parameters were estimated by maximum likelihood estimation (MLE). Across several comparative analyses (mean square error, the predictive power of the mean value function against the true values, evaluation of the intensity function, and reliability evaluation for a given mission time), the Lindley model was found to be an efficient model with the best performance. This study newly identifies the reliability performance of exponential-form distributions, for which no prior research case exists, and thereby provides basic design data that software developers can use in the initial stage.
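The quantities compared in the abstract (mean value function, intensity function, mission-time reliability, and mean square error) can be sketched for one NHPP model. The sketch assumes that the basic exponential model has the Goel-Okumoto form m(t) = a(1 - e^{-bt}); the abstract does not give the formula, and the parameter values and failure counts below are made up.

```python
import math

def mean_value(t, a, b):
    """Expected cumulative failures by time t: m(t) = a * (1 - exp(-b * t))."""
    return a * (1.0 - math.exp(-b * t))

def intensity(t, a, b):
    """Failure intensity, the derivative of m(t): a * b * exp(-b * t)."""
    return a * b * math.exp(-b * t)

def mission_reliability(t, tau, a, b):
    """Probability of no failure in (t, t + tau] for an NHPP."""
    return math.exp(-(mean_value(t + tau, a, b) - mean_value(t, a, b)))

def mean_square_error(observed, predicted):
    """Average squared gap between observed and predicted failure counts."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical parameters: a = total expected failures, b = detection rate.
a, b = 30.0, 0.05

# Made-up cumulative failure counts observed at these times.
times = [5, 10, 20, 40]
observed = [7, 12, 19, 26]
predicted = [mean_value(t, a, b) for t in times]

mse = mean_square_error(observed, predicted)
rel = mission_reliability(t=40, tau=1.0, a=a, b=b)
print(f"MSE = {mse:.3f}, reliability over the next time unit = {rel:.3f}")
```

In practice `a` and `b` would be estimated from failure time data by maximum likelihood, and the same functions would be rewritten for each competing distribution before the results are compared.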

Evaluation and estimation of the number of pigs raised and slaughtered using the traceability of animal products

  • Sukho Han
    • Korean Journal of Agricultural Science
    • /
    • v.49 no.1
    • /
    • pp.61-75
    • /
    • 2022
  • The first purpose of this study is to evaluate the usefulness of pork traceability data, a monthly time series, and to draw implications from that evaluation. The second purpose is to use the traceability data to construct a dynamic ecological equation model (DEEM) that reflects the biological characteristics of each stage in a pig's life: pregnancy, birth, growth, and slaughter. With the monthly pig model devised in this study, the number of slaughtered animals (supply) that can be shipped in the future is expected to be predictable, and policy simulations become possible. However, this study was limited to traceability data and focused only on building a supply-side model. Verification of the traceability data showed that approximately 6% of farms mix great grand parent (GGP), grand parent (GP), parent stock (PS), and artificial insemination (AI) production, so these need to be separated by business type. Nevertheless, the coefficient values estimated by constructing an equation for each growth stage were consistent with pig growth outcomes, and the model performed excellently in the predictive power test. For this reason, the model design, built on cohorts and a dynamic ecological equation system that accounts for biological growth and shipment times, and the traceability data are both judged to be excellent. Finally, the model constructed in this study is expected to serve as basic data that informs producers' decision-making and helps guide government policy on supply and demand. Research on the demand side is left for future work.