• Title/Summary/Keyword: predictive ability

Search Results: 290

Development and validation of prediction equations for the assessment of muscle or fat mass using anthropometric measurements, serum creatinine level, and lifestyle factors among Korean adults

  • Lee, Gyeongsil; Chang, Jooyoung; Hwang, Seung-sik; Son, Joung Sik; Park, Sang Min
    • Nutrition Research and Practice, v.15 no.1, pp.95-105, 2021
  • BACKGROUND/OBJECTIVES: The measurement of body composition, including muscle and fat mass, remains challenging in large epidemiological studies due to the time constraints and cost of accurate modalities. Therefore, this study aimed to develop and validate sex-specific prediction equations for lean body mass (LBM), appendicular skeletal muscle mass (ASM), and body fat mass (BFM), using anthropometric measurements, serum creatinine level, and lifestyle factors as independent variables and dual-energy X-ray absorptiometry as the reference method. SUBJECTS/METHODS: A sample of the Korean general adult population (men: 7,599; women: 10,009) from the Korean National Health and Nutrition Examination Survey 2008-2011 was included in this study. The participants were divided into derivation and validation groups via a random number generator (at a 70:30 ratio). The prediction equations were developed using a series of multivariable linear regressions and validated using the Bland-Altman plot and intraclass correlation coefficient (ICC). RESULTS: The initial and practical equations, which included age, height, weight, and waist circumference, had different predictive abilities by sex for LBM (men: R2 = 0.85, standard error of estimate [SEE] = 2.7 kg; women: R2 = 0.78, SEE = 2.2 kg), ASM (men: R2 = 0.81, SEE = 1.6 kg; women: R2 = 0.71, SEE = 1.2 kg), and BFM (men: R2 = 0.74, SEE = 2.7 kg; women: R2 = 0.83, SEE = 2.2 kg). Compared with the first prediction equation, adding other factors, including serum creatinine level, physical activity, smoking status, and alcohol use, increased R2 by 0.01 and lowered the SEE by 0.1. CONCLUSIONS: All equations had low bias, moderate agreement on the Bland-Altman plot, and high ICCs, indicating that these equations can be applied in other epidemiologic studies.
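The Bland-Altman agreement check mentioned in the abstract reduces to the mean bias between methods and its 95% limits of agreement. Below is a minimal, self-contained sketch; the helper name and the sample values are illustrative only and are not taken from the study.

```python
import statistics

def bland_altman_limits(predicted, reference):
    """Mean bias and 95% limits of agreement between two measurement methods."""
    diffs = [p - r for p, r in zip(predicted, reference)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)           # sample standard deviation of differences
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Illustrative LBM values in kg; not data from the study.
predicted = [52.1, 48.3, 60.2, 55.0, 47.8]
reference = [51.5, 49.0, 59.8, 54.2, 48.5]
bias, (lower, upper) = bland_altman_limits(predicted, reference)
```

A Bland-Altman plot then charts each pair's difference against its mean, with horizontal lines at the bias and the two limits.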

Personalized Diabetes Risk Assessment Through Multifaceted Analysis (PD-RAMA): A Novel Machine Learning Approach to Early Detection and Management of Type 2 Diabetes

  • Gharbi Alshammari
    • International Journal of Computer Science & Network Security, v.23 no.8, pp.17-25, 2023
  • The alarming global prevalence of Type 2 Diabetes Mellitus (T2DM) has catalyzed an urgent need for robust, early diagnostic methodologies. This study unveils a pioneering approach to predicting T2DM, employing the Extreme Gradient Boosting (XGBoost) algorithm, renowned for its predictive accuracy and computational efficiency. The investigation harnesses a meticulously curated dataset of 4303 samples, extracted from a comprehensive Chinese research study, scrupulously aligned with the World Health Organization's indicators and standards. The dataset encapsulates a multifaceted spectrum of clinical, demographic, and lifestyle attributes. Through an intricate process of hyperparameter optimization, the XGBoost model exhibited an unparalleled best score, elucidating a distinctive combination of parameters such as a learning rate of 0.1, max depth of 3, 150 estimators, and specific colsample strategies. The model's validation accuracy of 0.957, coupled with a sensitivity of 0.9898 and specificity of 0.8897, underlines its robustness in classifying T2DM. A detailed analysis of the confusion matrix further substantiated the model's diagnostic prowess, with an F1-score of 0.9308, illustrating its balanced performance in true positive and negative classifications. The precision and recall metrics provided nuanced insights into the model's ability to minimize false predictions, thereby enhancing its clinical applicability. The research findings not only underline the remarkable efficacy of XGBoost in T2DM prediction but also contribute to the burgeoning field of machine learning applications in personalized healthcare. By elucidating a novel paradigm that accentuates the synergistic integration of multifaceted clinical parameters, this study fosters a promising avenue for precise early detection, risk stratification, and patient-centric intervention in diabetes care. 
The research serves as a beacon, inspiring further exploration and innovation in leveraging advanced analytical techniques for transformative impacts on predictive diagnostics and chronic disease management.
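The sensitivity, specificity, and F1 figures quoted above are all simple functions of the confusion-matrix counts. A minimal sketch of those derivations follows; the counts are hypothetical and are not the study's confusion matrix.

```python
def binary_metrics(tp, fp, fn, tn):
    """Derive sensitivity, specificity, precision, and F1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # true positive rate (recall)
    specificity = tn / (tn + fp)          # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return sensitivity, specificity, precision, f1

# Hypothetical counts for illustration only.
sens, spec, prec, f1 = binary_metrics(tp=180, fp=12, fn=8, tn=100)
```

Because F1 balances precision and recall, it is the headline figure when, as in T2DM screening, both false negatives and false positives carry clinical cost.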

A Study on Predictive Modeling of I-131 Radioactivity Based on Machine Learning (머신러닝 기반 고용량 I-131의 용량 예측 모델에 관한 연구)

  • Yeon-Wook You; Chung-Wun Lee; Jung-Soo Kim
    • Journal of radiological science and technology, v.46 no.2, pp.131-139, 2023
  • High-dose I-131 used for the treatment of thyroid cancer causes localized exposure among the radiology technologists handling it. Because there is a delay between the calibration date and the date the I-131 dose is administered to a patient, it is necessary to directly measure the radioactivity of the administered dose using a dose calibrator. In this study, we applied machine learning models to external dose rates measured from shielded I-131 in order to predict its radioactivity. External dose rates were measured at distances of 1 m, 0.3 m, and 0.1 m from a shielded container holding the I-131, for a total of 868 sets of measurements. For modeling, we used the hold-out method to partition the data at a 7:3 ratio (609 training : 259 test). For the machine learning algorithms, we chose linear regression, decision tree, random forest, and XGBoost. To evaluate the models, we calculated root mean square error (RMSE), mean square error (MSE), and mean absolute error (MAE) for accuracy, and R2 for explanatory power. The evaluation results were as follows: linear regression (RMSE 268.15, MSE 71,901.87, MAE 231.68, R2 0.92), decision tree (RMSE 108.89, MSE 11,856.92, MAE 19.24, R2 0.99), random forest (RMSE 8.89, MSE 79.10, MAE 6.55, R2 0.99), and XGBoost (RMSE 10.21, MSE 104.22, MAE 7.68, R2 0.99). The random forest model achieved the highest predictive ability. Improving the model's performance in the future is expected to help lower exposure among radiology technologists.
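The four evaluation measures used above are straightforward to compute from paired true and predicted values. The sketch below is generic, and the radioactivity values are illustrative, not measurements from the study.

```python
import math

def regression_metrics(y_true, y_pred):
    """RMSE, MSE, and MAE for accuracy, and R^2 for explanatory power."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    mean_t = sum(y_true) / n
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)   # total variance around the mean
    r2 = 1 - (mse * n) / ss_tot                       # fraction of variance explained
    return rmse, mse, mae, r2

# Illustrative radioactivity values (MBq); not data from the study.
y_true = [3700, 5550, 7400, 9250]
y_pred = [3650, 5600, 7300, 9300]
rmse, mse, mae, r2 = regression_metrics(y_true, y_pred)
```

Note RMSE and MSE rank models identically (one is the square root of the other); MAE is less sensitive to large single errors, which is why the decision tree above can have a much lower MAE than RMSE.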

A Study on the Prediction Model for Analysis of Water Quality in Gwangju Stream using Machine Learning Algorithm (머신러닝 학습 알고리즘을 이용한 광주천 수질 분석에 대한 예측 모델 연구)

  • Yu-Jeong Jeong; Jung-Jae Lee
    • The Journal of the Korea institute of electronic communication sciences, v.19 no.3, pp.531-538, 2024
  • While the importance of the water quality environment is increasingly emphasized, the water quality index for improving the urban rivers of Gwangju Metropolitan City is an important factor affecting the aquatic ecosystem and requires accurate prediction. In this paper, the XGBoost and LightGBM machine learning algorithms were used to compare performance on the water quality inspection items of two important points of Gwangju Stream: the downstream Pyeongchon Bridge and the upstream BanghakBr_Gwangjucheon1 water systems. Based on statistical verification, three water quality indicators, total nitrogen (TN), nitrate (NO3), and ammonia (NH3), were predicted, and the performance of the predictive models was evaluated using RMSE, a regression model evaluation index. After implementing individual models for each water system and comparing their performance under cross-validation, the XGBoost model showed the better predictive ability.
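Cross-validation, as used above to compare the two boosting models, partitions the samples into k folds and rotates the held-out fold. A generic index-splitting sketch (not the authors' code; the sizes are illustrative):

```python
def k_fold_indices(n_samples, k):
    """Split sample indices into k roughly equal contiguous folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

# For each fold: hold it out as the test set, train on the rest, then average
# the per-fold RMSE to compare models (e.g., XGBoost vs. LightGBM).
for test_fold in k_fold_indices(10, 5):
    train_idx = [i for i in range(10) if i not in test_fold]
```

In practice, time-series water quality data would usually be split chronologically rather than at random, so that the model never trains on measurements taken after those it is asked to predict.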

A study on the use of a Business Intelligence system : the role of explanations (비즈니스 인텔리전스 시스템의 활용 방안에 관한 연구: 설명 기능을 중심으로)

  • Kwon, YoungOk
    • Journal of Intelligence and Information Systems, v.20 no.4, pp.155-169, 2014
  • With the rapid advances in technologies, organizations are increasingly likely to depend on information systems in their decision-making processes. Business Intelligence (BI) systems, in particular, have become a mainstay in dealing with complex problems in an organization, partly because a variety of advanced computational methods from statistics, machine learning, and artificial intelligence can be applied to solve business problems such as demand forecasting. In addition to the ability to analyze past and present trends, these predictive analytics capabilities provide huge value to an organization's ability to respond to changes in markets, business risks, and customer trends. While the performance effects of BI system use in organizational settings have been studied, little has been discussed about the use of the predictive analytics technologies embedded in BI systems for forecasting tasks. Thus, this study aims to find important factors that can help users take advantage of the advanced technologies of a BI system. More generally, a BI system can be viewed as an advisor, defined as one who formulates judgments or recommends alternatives and communicates these to the person in the role of the judge, with the information generated by the BI system serving as advice that a decision maker (judge) can follow. Thus, we refer to findings from the advice-giving and advice-taking literature, focusing on the role of the system's explanations in users' advice taking. It has been shown that advice discounting can occur when an advisor's reasoning, or the evidence justifying the advisor's decision, is not available. However, the majority of current BI systems merely provide a number, which may influence decision makers in accepting the advice and inferring the quality of the advice.
In this study, we explore the following key factors that can influence users' advice taking within the setting of a BI system: explanations of how box-office grosses are predicted; the type of advisor, i.e., a system (data mining technique) or human-based business advice mechanisms such as prediction markets (aggregated human advice) and human advisors (individual human expert advice); users' evaluations of the provided advice; and individual differences among decision makers. Each subject performs the following four tasks by going through a series of display screens on the computer. First, given information about a movie, such as its director and genre, the subjects are asked to predict the movie's opening weekend box office. Second, in light of the information generated by an advisor, the subjects are asked to adjust their original predictions if they desire to do so. Third, they are asked to evaluate the value of the given information (e.g., perceived usefulness, trust, satisfaction). Lastly, a short survey is conducted to identify individual differences that may affect advice taking. The results from the experiment show that subjects are more likely to follow system-generated advice than human advice when the advice is provided with an explanation. When the subjects, as system users, think the information provided by the system is useful, they are also more likely to take the advice. In addition, individual differences affect advice taking. Subjects with more expertise regarding advisors, or who tend to agree with others, adjust their predictions to follow the advice. On the other hand, subjects with more knowledge about movies are less affected by the advice, and their final decisions stay close to their original predictions. The advances in the predictive analytics of BI systems demonstrate great potential to support increasingly complex business decisions. 
This study shows how the design of a BI system can influence users' acceptance of system-generated advice, and the findings provide valuable insights on how to leverage the advanced predictive analytics of a BI system in an organization's forecasting practices.

A Method for Selecting Software Reliability Growth Models Using Partial Data (부분 데이터를 이용한 신뢰도 성장 모델 선택 방법)

  • Park, Yong Jun; Min, Bup-Ki; Kim, Hyeon Soo
    • KIPS Transactions on Software and Data Engineering, v.4 no.1, pp.9-18, 2015
  • Software Reliability Growth Models (SRGMs) are useful for determining the software release date or the additional testing effort needed, based on software failure data. No single SRGM is appropriate for all software, and a large number of SRGMs have already been proposed for estimating software reliability measures; selecting an optimal SRGM for a particular case has therefore been an important issue. Existing methods for selecting a SRGM use the entire collected failure data. However, initial failure data may not affect future failure occurrence and, in some cases, distorts the evaluation of future failures. In this paper, we suggest a method for selecting a SRGM based on evaluating goodness-of-fit using partial data. Our approach uses the failure data excluding the inordinately unstable portion. We find the portion of data to use for selecting a SRGM by comparing the entire failure data with the partial failure data (excluding the initial failure data) with respect to the predictive ability for future failures. To justify our approach, this paper shows, on real collected failure data, that predicting future failures using the partial data is more accurate than using the entire failure data.
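To make the idea concrete, one can fit an SRGM to both the full failure history and a partial history that drops the earliest, unstable points, then compare fit error. The sketch below uses the classic Goel-Okumoto model and a crude grid search as a stand-in; the paper does not specify this model or fitting method, and the failure counts are made up.

```python
import math

def goel_okumoto(t, a, b):
    """Goel-Okumoto mean value function: m(t) = a * (1 - exp(-b * t))."""
    return a * (1 - math.exp(-b * t))

def fit_sse(times, counts, a_grid, b_grid):
    """Crude grid-search fit: return (a, b, sse) minimizing the squared error."""
    best = None
    for a in a_grid:
        for b in b_grid:
            sse = sum((m - goel_okumoto(t, a, b)) ** 2
                      for t, m in zip(times, counts))
            if best is None or sse < best[2]:
                best = (a, b, sse)
    return best

# Illustrative cumulative failure counts per test week; not the paper's data.
times = list(range(1, 11))
fails = [5, 9, 12, 15, 17, 18, 19, 20, 20, 21]
a_grid = [20 + i for i in range(6)]           # candidate total expected failures
b_grid = [0.05 * i for i in range(1, 11)]     # candidate failure detection rates

full_fit = fit_sse(times, fails, a_grid, b_grid)
partial_fit = fit_sse(times[2:], fails[2:], a_grid, b_grid)  # drop unstable early points
```

The paper's actual selection criterion compares predictive accuracy on held-out future failures rather than in-sample fit, but the same fitted models would feed that comparison.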

The Analysis and Design of Advanced Neurofuzzy Polynomial Networks (고급 뉴로퍼지 다항식 네트워크의 해석과 설계)

  • Park, Byeong-Jun; O, Seong-Gwon
    • Journal of the Institute of Electronics Engineers of Korea CI, v.39 no.3, pp.18-31, 2002
  • In this study, we introduce the concept of advanced neurofuzzy polynomial networks (ANFPN), a hybrid modeling architecture combining neurofuzzy networks (NFN) and polynomial neural networks (PNN). These networks are highly nonlinear rule-based models. The development of the ANFPN draws on the technologies of Computational Intelligence (CI), namely fuzzy sets, neural networks, and genetic algorithms. The NFN contributes to the formation of the premise part of the rule-based structure of the ANFPN, while the consequence part of the ANFPN is designed using the PNN. In the premise part, the NFN uses both simplified fuzzy inference and the error back-propagation learning rule, and the parameters of the membership functions, the learning rates, and the momentum coefficients are adjusted through genetic optimization. As the consequence structure of the ANFPN, the PNN is a flexible network architecture whose structure (topology) is developed through learning. In particular, the number of layers and nodes of the PNN is not fixed in advance but is generated dynamically. We introduce two kinds of ANFPN architectures, a basic and a modified one, which differ in the number of input variables and the order of the polynomial in each layer of the PNN structure. Owing to the specific features of the two combined architectures, it is possible to capture the nonlinear characteristics of a process system and to obtain better output performance with superb predictive ability. The availability and feasibility of the ANFPN are discussed and illustrated with the aid of two representative numerical examples. The results show that the proposed ANFPN can produce models with higher accuracy and predictive ability than previously presented methods.
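PNN layers of the kind described above are built from low-order polynomial nodes in the GMDH tradition. A minimal sketch of one second-order node with two inputs; the coefficient vector w is assumed to be found by least squares during layer construction, and this is a generic illustration rather than the paper's exact node form.

```python
def pnn_node(x1, x2, w):
    """One second-order polynomial node, the basic PNN building block:
    y = w0 + w1*x1 + w2*x2 + w3*x1*x2 + w4*x1^2 + w5*x2^2."""
    return (w[0] + w[1] * x1 + w[2] * x2
            + w[3] * x1 * x2 + w[4] * x1 ** 2 + w[5] * x2 ** 2)
```

Layer growth then amounts to forming such nodes for pairs of candidate inputs, keeping the best-fitting ones, and feeding their outputs to the next layer, which is why the topology need not be fixed in advance.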

A Study on the Prediction Model of the Elderly Depression

  • SEO, Beom-Seok; SUH, Eung-Kyo; KIM, Tae-Hyeong
    • The Journal of Industrial Distribution & Business, v.11 no.7, pp.29-40, 2020
  • Purpose: In modern society, many urban problems are occurring, such as aging, the hollowing out of old city centers, and polarization within cities. In this study, we apply big data and machine learning methodologies to predict depression symptoms in the elderly population early on, thus contributing to solving the problem of elderly depression. Research design, data and methodology: The machine learning technique used was random forest, and important variables were estimated by analyzing the correlations between the CES-D10, which is widely used worldwide, and the other variables. Two dependent variables were set up, one distinguishing normal from depression and one distinguishing moderate from severe depression, and a total of 106 independent variables were included, covering objective characteristics of the elderly as well as surveys on subjective health conditions, cognitive ability, employment, household background, income, consumption, assets, subjective expectations, and quality of life. Results: The analysis showed that satisfaction with the residential area, quality-of-life and cognitive ability scores, satisfaction with living quality and economic conditions, and the number of outpatient visits to clinics in the living area were important variables in classifying elderly depression. In the random forest performance evaluation, the model classifying the presence or absence of elderly depression achieved an accuracy of 86.3%, a sensitivity of 79.5%, and a specificity of 93.3%, while the model classifying the degree of elderly depression achieved an accuracy of 86.1%, a sensitivity of 93.9%, and a specificity of 74.7%. Conclusions: In this study, the important variables of the estimated predictive model were identified using the random forest technique, with the focus on predictive performance itself. 
Although the research has limitations, such as the lack of clear criteria for classifying depression levels and the failure to reflect variables beyond the KLoSA data, it is expected that if additional variables are secured in the future and high-performance predictive models are estimated through various machine learning techniques, early detection of depression could help improve the quality of life of senior citizens and support public policy decisions.

Assessment of Sensitivity of Photo-Chromosomal Assay in the Prediction of Photo-carcinogenicity (광염색체이상시험의 광발암성 예측능력에 대한 평가)

  • Hong Mi-Young; Kim Ji-Young; Lee Young Mi; Lee Michael
    • Toxicological Research, v.21 no.2, pp.99-105, 2005
  • Photo-mutagenic compounds are known to alter skin cancer rates by acting as initiators or by affecting subsequent steps in carcinogenesis. The objectives of this study were to investigate the utility of the photo-chromosomal aberration (photo-CA) assay for detecting photo-clastogens and to evaluate its ability to predict rodent photo-carcinogenicity. The photo-CA assay was performed with five test substances that had demonstrated positive results in photo-carcinogenicity tests: 8-methoxypsoralen (a photoactive substance that forms DNA adducts in the presence of ultraviolet A irradiation), chlorpromazine (an aliphatic phenothiazine and alpha-adrenergic blocking agent), lomefloxacin (an antibiotic in the fluoroquinolone class), anthracene (a tricyclic aromatic hydrocarbon and a basic substance for the production of anthraquinone, dyes, pigments, insecticides, wood preservatives, and coating materials), and retinoic acid (a retinoid compound closely related to vitamin A). For the best discrimination between test substance-mediated genotoxicity and the undesirable genotoxicity caused by direct UV absorption by DNA, a UV dose-response of the cells in the absence of the test substances was first analyzed. All five test substances showed a positive outcome in the photo-CA assay, indicating that the photo-CA test is very sensitive to the photo-genotoxic effect of UV irradiation. With this limited data set, an investigation into the predictive value of the photo-CA test showed that it has a high ability to predict photo-carcinogenicity. Therefore, the photo-CA test using mammalian cells appears to be a sensitive method for evaluating the photo-carcinogenic potential of new compounds.

Systemic Inflammation Response Syndrome Score Predicts the Mortality in Multiple Trauma Patients

  • Baek, Jong Hyun; Kim, Myeong Su; Lee, Jung Cheul; Lee, Jang Hoon
    • Journal of Chest Surgery, v.47 no.6, pp.523-528, 2014
  • Background: Numerous statistical models have been developed to accurately predict outcomes in multiple trauma patients. However, such trauma scoring systems reflect the patient's physiological condition only to a limited extent and are difficult to use in a rapid initial assessment. We studied the predictive ability of the systemic inflammatory response syndrome (SIRS) score compared to other scoring systems. Methods: We retrospectively reviewed 229 patients with multiple trauma combined with chest injury from January 2006 to June 2011. A SIRS score was calculated for each patient based on presentation to the emergency room. The patients were divided into two groups: those with a SIRS score of two points or above and those with a SIRS score of one or zero. The outcomes of the two groups were then compared, as was the ability of the SIRS score and other injury severity scoring systems to predict mortality. Results: Hospital death occurred in 12 patients (5.2%). There were no significant differences in the general characteristics of the patients, but the trauma severity scores differed significantly between the two groups. The SIRS scores, number of complications, and mortality rate were significantly higher in those with a SIRS score of two or above (p<0.001). In the multivariate analysis, the SIRS score was the only independent factor related to mortality. Conclusion: The SIRS score is easily calculated on admission and may accurately predict mortality in patients with multiple traumas.
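The SIRS score used above is simply the count of standard criteria met on presentation. A minimal sketch using the conventional cut-offs; the PaCO2 and immature-band alternatives are omitted for brevity, and the example patient is hypothetical.

```python
def sirs_score(temp_c, heart_rate, resp_rate, wbc_k):
    """Count of standard SIRS criteria met (0-4): temperature >38 or <36 deg C,
    heart rate >90/min, respiratory rate >20/min, and WBC >12,000 or <4,000
    cells/mm^3 (wbc_k given in thousands)."""
    score = 0
    if temp_c > 38.0 or temp_c < 36.0:
        score += 1
    if heart_rate > 90:
        score += 1
    if resp_rate > 20:
        score += 1
    if wbc_k > 12.0 or wbc_k < 4.0:
        score += 1
    return score

# Hypothetical patient: febrile and tachycardic, normal RR and WBC.
# A score of 2 or above would place them in the study's higher-risk group.
score = sirs_score(38.6, 105, 16, 8.2)
```

Its appeal for rapid initial assessment is exactly this simplicity: all four inputs are available within minutes of arrival in the emergency room.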