• Title/Summary/Keyword: multinomial logistic analysis

Two-stage imputation method to handle missing data for categorical response variable

  • Jong-Min Kim;Kee-Jae Lee;Seung-Joo Lee
    • Communications for Statistical Applications and Methods / v.30 no.6 / pp.577-587 / 2023
  • Conventional categorical data imputation techniques, such as mode imputation, often encounter issues related to overestimation. If the variable has too many categories, the multinomial logistic regression imputation method may be infeasible because of computational limitations. To address these limitations, we propose a two-stage imputation method. In the first stage, we apply the Boruta variable selection method to the complete dataset to identify significant variables for the target categorical variable. In the second stage, we use the selected variables to impute the missing data: logistic regression for binary variables, polytomous regression for categorical variables, and predictive mean matching for quantitative variables. Through analysis of both asymmetric and non-normal simulated and real data, we demonstrate that the two-stage imputation method outperforms imputation methods lacking variable selection, as evidenced by accuracy measures. In the analysis of real survey data, we also demonstrate that the proposed two-stage imputation method surpasses the current imputation approach in terms of accuracy.
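
The overestimation issue attributed to mode imputation in the abstract above can be illustrated with a minimal sketch. The data, the category labels, and the deterministic missingness pattern are hypothetical and are not taken from the paper; the point is only that filling every gap with the modal category inflates that category's share.

```python
from collections import Counter


def mode_impute(values):
    """Replace None entries with the most frequent observed category."""
    observed = [v for v in values if v is not None]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is None else v for v in values]


def share(values, category):
    """Fraction of entries equal to `category` (values must be complete)."""
    return sum(v == category for v in values) / len(values)


# Hypothetical complete data: 50% "a", 30% "b", 20% "c".
complete = ["a"] * 50 + ["b"] * 30 + ["c"] * 20

# Hypothetical missingness: drop 3 of every 10 values, so each category
# loses 30% of its entries (15 "a", 9 "b", 6 "c").
with_missing = [None if i % 10 < 3 else v for i, v in enumerate(complete)]

imputed = mode_impute(with_missing)

true_share = share(complete, "a")      # share of "a" before deletion
imputed_share = share(imputed, "a")    # share of "a" after mode imputation
print(true_share, imputed_share)
```

Here all 30 missing values are assigned to "a", so its share rises from 0.50 to 0.65 even though the missingness was spread evenly across categories.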

Prediction on Busan's Gross Product and Employment of Major Industry with Logistic Regression and Machine Learning Model (로지스틱 회귀모형과 머신러닝 모형을 활용한 주요산업의 부산 지역총생산 및 고용 효과 예측)

  • Chae-Deug Yi
    • Korea Trade Review / v.47 no.2 / pp.69-88 / 2022
  • This paper aims to predict Busan's regional product and employment using logistic regression models and machine learning models. The main findings of the empirical analysis are as follows. First, the OLS regression model shows that main industries such as electricity and electronics, machine and transport, and finance and insurance affect Busan's income positively. Second, the binomial logistic regression models show that Busan's strategic industries, such as the future transport machinery, life-care, and smart marine industries, contribute to Busan's income, in that order. Third, the multinomial logistic regression models show that Korea's main industries, such as precise machinery, transport equipment, and machinery, influence Busan's economy positively. In addition, Korea's exports and depreciation can affect Busan's economy more positively at higher employment levels. Fourth, the voting ensemble model shows higher predictive power than the artificial neural network and support vector machine models. Furthermore, the gradient boosting model and the random forest show higher predictive power than the voting model, in that order.

Latent Classes of Depressive Symptom Trajectories of Adolescents and Determinants of Classes (청소년 우울 증상의 변화 궤적에 따른 잠재계층유형 및 영향요인)

  • Kim, Eunjoo
    • Research in Community and Public Health Nursing / v.33 no.3 / pp.299-311 / 2022
  • Purpose: Untreated depression in adolescents affects their entire lives. It is important to detect depression early in adolescence and intervene, considering that adolescents' depressive symptoms are accompanied by internalization and externalization. The aim of this study was to identify latent classes of depressive symptom trajectories of adolescents in Korea and the determinants of those classes. Methods: Three time-point (2018-2020) data derived from the Korean Children and Youth Panel Survey 2018 were used (N=2,325). Latent Growth Curve Modeling (LGCM) was conducted to explore the depressive symptom trajectories in all adolescents, and Latent Class Growth Modeling (LCGM) was conducted to identify each latent class. Multinomial logistic regression analysis was performed to confirm the determinants of each latent class. Results: The LGCM results showed no statistically significant change in depressive symptoms across all adolescents over the 3 years. However, the LCGM results distinguished four latent classes with different trajectories: 1) low-stable (intercept=14.39, non-significant slope), 2) moderate-increasing (intercept=19.62, significantly increasing slope), 3) high-stable (intercept=26.30, non-significant slope), and 4) high-rapidly decreasing (intercept=26.34, significantly rapidly decreasing slope). The multinomial logistic regression analysis showed that the significant determinants (i.e., gender, self-esteem, aggression, somatization, peer relationship) differed for each latent class. Conclusion: When screening adolescents for depression, it is necessary to monitor not only direct depression symptoms but also self-esteem, aggression, somatization symptoms, and peer relationships. The findings of this study may be valuable for nurses and policy makers in developing mental health programs for adolescents.

Association between oral health status and body mass index in older adults (노인의 구강건강상태와 체질량지수의 연관성)

  • Cho, Younyoung;Lee, Yunhwan;Kim, Jinhee
    • Journal of Korean Society of Dental Hygiene / v.16 no.1 / pp.129-136 / 2016
  • Objectives: The purpose of the study is to investigate the relationship between oral health status and body mass index (BMI) in adults over 65 years old. Methods: The study subjects were 4,550 adults over 65 years old from the 5th Korea National Health and Nutrition Examination Survey (KNHANES V) in 2010-2012. Mastication-related oral health status included the number of remaining teeth and the mean number of decayed, missing, and filled permanent teeth (DMFT). BMI ($kg/m^2$) was categorized as underweight (<18.5), normal weight (18.5-22.9), overweight (23.0-24.9), and obese ($\geq 25.0$). Multinomial logistic regression analysis was performed to examine the association of BMI categories with the number of remaining teeth and DMFT. Results: The mean DMFT was highest ($13.0 \pm 0.7$) in the underweight group and lowest ($8.8 \pm 0.3$) in the obese group. Those with less favorable masticatory ability, fewer remaining teeth, and no prosthesis tended to be underweight. Those with a higher number of remaining teeth and prosthetic teeth tended to be overweight or obese. In the multinomial logistic regression analysis, compared with those having 20 or more remaining teeth, including prosthetic teeth, those having fewer than 20 remaining teeth and no prosthesis had 4.48 times higher odds of being underweight. DMFT was positively associated with underweight and negatively associated with overweight or obesity. Conclusions: Masticatory ability and dental caries prevention helped maintain a healthy body weight in older adults.
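
The BMI categorization described in the abstract above can be sketched as follows. The cutoffs are the ones stated in the abstract; the function name is hypothetical, and because the abstract reports ranges to one decimal place, this sketch assumes that values between 22.9 and 23.0 count as normal weight and values between 24.9 and 25.0 count as overweight.

```python
def bmi_category(weight_kg: float, height_m: float) -> str:
    """Classify BMI (kg/m^2) using the cutoffs stated in the abstract.

    Assumption: boundaries are treated as half-open intervals
    (<18.5, <23.0, <25.0, otherwise obese).
    """
    bmi = weight_kg / height_m ** 2
    if bmi < 18.5:
        return "underweight"
    if bmi < 23.0:
        return "normal weight"
    if bmi < 25.0:
        return "overweight"
    return "obese"


# Hypothetical subjects, all 1.70 m tall.
for weight in (50, 60, 68, 80):
    print(weight, bmi_category(weight, 1.70))
```

A categorical outcome built this way is the kind of dependent variable a multinomial logistic regression would take, typically with normal weight as the reference category, although the abstract does not state which reference was used.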

Comparison of Determinants of Healthy Food Intake Before and After COVID-19 - Based on 2019~2021 Consumer Behavior Survey for Food - (COVID-19 전후 건강식품 섭취 여부 결정요인 비교 - 2019년~2021년 식품소비행태조사 자료 이용 -)

  • Su-yeon Jung;Na-young Kim;Eun-seo Jeon;Keum-il Jang;Seon-woong Kim
    • The Korean Journal of Food And Nutrition / v.36 no.4 / pp.309-320 / 2023
  • This study examined the determinants of healthy food purchases before and after COVID-19 in Korea. Binomial and multinomial logistic regression models were applied to Korea Rural Economic Institute's Food Consumer Behavior Survey data from 2019 to 2021. The analysis revealed a significant decrease in the non-intake of healthy food in 2021 compared to 2019, suggesting the impact of COVID-19 on healthy food consumption. Consumption patterns also changed, with a decrease in direct purchases and an increase in gift-based purchases. Several variables showed significant effects on healthy food intake. Single-person households exhibited a higher probability of eating healthy food after COVID-19. The group perceiving themselves as healthy had a lower likelihood of consuming healthy food pre-COVID-19, but this changed after the pandemic. Online food purchases, eco-friendly food purchases, and nut consumption showed a gradual decrease in the probability of non-intake over time. Gender and age also influenced healthy food intake. The probability of eating healthy food increased in the older age group compared to the younger group, and the probability increased significantly after COVID-19. The probability of buying gifts was significantly higher in those in their 60s, indicating that the path to obtaining healthy food differed by age.

Ranking subjects based on paired compositional data with application to age-related hearing loss subtyping

  • Nam, Jin Hyun;Khatiwada, Aastha;Matthews, Lois J.;Schulte, Bradley A.;Dubno, Judy R.;Chung, Dongjun
    • Communications for Statistical Applications and Methods / v.27 no.2 / pp.225-239 / 2020
  • Analysis approaches for single compositional data are well established; however, effective analysis strategies for paired compositional data remain to be investigated. The current project was motivated by studies of age-related hearing loss (presbyacusis), where subjects are classified into four audiometric phenotypes and need to be ranked within these phenotypes based on their paired compositional data. We address this challenge by formulating it as a classification problem and integrating a penalized multinomial logistic regression model with compositional data analysis approaches. We use Elastic Net as the penalty function, while considering average, absolute difference, and perturbation operators for compositional data. We applied the proposed approach to the presbyacusis study of 532 subjects with probabilities that each ear of a subject belongs to each of four presbyacusis subtypes. We further investigated the ranking of presbyacusis subjects using the proposed approach based on previous literature. The data analysis results indicate that the proposed approach is effective for ranking subjects based on paired compositional data.

Analysis of Research Topics and Trends on COVID-19 in Korea Using Latent Dirichlet Allocation (LDA)

  • Heo, Seong-Min;Yang, Ji-Yeon
    • Journal of the Korea Society of Computer and Information / v.25 no.12 / pp.83-91 / 2020
  • This study aims to identify research topics and examine the trend of COVID-19-related papers on DBpia. Applying latent Dirichlet allocation (LDA), we extracted seven research topics: "International Dynamics", "Technology & Security", "Psychological Impact", "Biomedical-Related", "Economic Impact", "Online Education", and "Religion-Related". In addition, we used the multinomial logistic model to examine the trend of research topics. We found that the papers mainly covered topics related to "International Dynamics" and "Biomedical-Related" before June 2020, but the topics have become more diverse since then. In particular, topics regarding "Economic Impact", "Online Education", and "Psychological Impact" have drawn increased attention from researchers. The findings may provide a guideline for collaboration in COVID-19-related research and could serve as a reference for active research.

Analysis of Determinants of Home Meal Replacement Purchase Frequency before and after COVID-19 based on a Consumer Behavior Survey (COVID-19 전후 소비자의 간편식 구입 빈도 결정 요인 비교)

  • Oh, Young-jin;Jang, Keum-il;Kim, Seon-woong
    • The Korean Journal of Food And Nutrition / v.34 no.6 / pp.576-583 / 2021
  • The purpose of this study was to estimate the influence of the determinants of home meal replacement (HMR) purchase frequency before and after COVID-19. Multinomial logistic regression was applied to the 2018-2020 Consumer Behavior Survey for Food data from the Korea Rural Economic Institute (KREI). Gender, age, number of households, monthly income, use of eating out, delivery and takeout order service, HMR food safety concern, and the frequency of cooking at home, grocery shopping, and eating alone were used as the explanatory variables for HMR purchase frequency. The results are as follows. Compared to the previous year, the growth rate of HMR purchase frequency in 2020 was relatively high, indicating that the COVID-19 outbreak acted as a catalyst. Unlike in 2018 and 2019, there was no statistical difference in HMR purchase frequency between single- and multi-person households in 2020, indicating that multi-person households began to emerge as one of the major HMR consumption groups. Unlike in 2018, the 2020 HMR purchase frequency showed a statistically positive relationship with the frequencies of grocery shopping and eating alone. There was a positive relationship between the frequency of eating out/food delivery orders and HMR purchases. The more often cooking at home occurred, the less HMR food was purchased.

Differences in Time Use Satisfaction by Time Allocation Types of the Elderly (노인의 시간배분 유형에 따른 시간사용만족도의 차이)

  • Kim, Oi-Sook
    • Journal of Family Resource Management and Policy Review / v.19 no.1 / pp.163-180 / 2015
  • The purpose of this study was to explore a typology of time allocation, investigate the determinants of time allocation types, and analyze differences in time use satisfaction by the time use types of the elderly. The data source for this research was the 2009 Time Use Survey conducted by the Korea National Statistical Office (KNSO). The 4,699 time diaries (3,552 for weekdays, 1,147 for Sundays) completed by elderly people over the age of 60 were analyzed using mean, standard deviation, chi-square, cluster analysis, ANOVA, the Duncan test, and multinomial logistic regression analysis. Time allocation of the elderly was classified into four types: personal care oriented, work oriented, leisure oriented, and balanced. Gender, age, education, employment status, income, and the presence of a spouse were identified as determinants of each type. Time use satisfaction differed by time allocation type on weekdays.

Factors Affecting the Smoking Type Experience of Korean Adolescents (우리나라 청소년들의 흡연유형 경험 영향요인)

  • Bin, Sung-Oh
    • The Journal of Korean Society for School & Community Health Education / v.23 no.2 / pp.65-76 / 2022
  • Objectives: The purpose of this study is to investigate the factors that affect smoking type among those who have used regular cigarettes or liquid or cigarette-type e-cigarettes. Methods: The subjects of analysis were 6,081 people who had smoked regular cigarettes or e-cigarettes. The SPSS ver. 25.0 statistical package was used for data analysis. Multinomial logistic regression analysis was performed to identify the factors affecting smoking type. Results: The factors affecting the experience of using e-cigarettes compared to regular cigarette smoking were gender, class, academic performance, living with family members, drinking experience, and secondhand smoke in school. The factors influencing dual use compared to regular cigarette smoking were gender, class, academic performance, economic status, living with family, drinking experience, and experience of secondhand smoke in school. Smoking cessation attempts had an effect on dual use compared to regular cigarette smoking. Conclusion: Smoking cessation experience had a greater effect on e-cigarette use than regular cigarette smoking.
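
Several abstracts in this listing compare outcome categories against a reference category, as in "e-cigarette use compared to regular cigarette smoking" above. A minimal sketch of how a baseline-category multinomial logistic model turns coefficients into category probabilities is given below. The coefficients, the single binary predictor, and the function name are hypothetical and are not estimates from any of the listed studies.

```python
import math


def multinomial_probs(x, coefs, reference):
    """Category probabilities under a baseline-category logit model.

    x         : list of predictor values
    coefs     : {category: (intercept, [slopes])} for non-reference categories
    reference : name of the reference category (its linear score is fixed at 0)
    """
    scores = {reference: 0.0}
    for category, (intercept, slopes) in coefs.items():
        scores[category] = intercept + sum(b * xi for b, xi in zip(slopes, x))
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    exps = {c: math.exp(s - top) for c, s in scores.items()}
    total = sum(exps.values())
    return {c: e / total for c, e in exps.items()}


# Hypothetical coefficients for one binary predictor (1 = has the exposure).
coefs = {
    "e-cigarette only": (-1.0, [0.5]),
    "dual use": (-0.5, [0.8]),
}
probs = multinomial_probs([1.0], coefs, reference="regular cigarettes")
print(probs)

# exp(slope) is the odds ratio of a category versus the reference
# for a one-unit increase in the predictor.
odds_ratio_dual = math.exp(0.8)
print(odds_ratio_dual)
```

Each non-reference category gets its own coefficient vector, which is why a determinant can be significant for one contrast (for example, dual use versus regular cigarettes) and not for another.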