• Title/Summary/Keyword: Categorical Variables

A Method for Reduction of Categorical Variables Based on a Concept of Pseudo-Correlation Coefficient (유사상관계수의 개념을 도입한 범주형 변수의 축약에 관한 연구)

  • Kwon, Cheol-Shin;Hong, Soon-Wook
    • IE interfaces / v.14 no.1 / pp.79-83 / 2001
  • In this paper, we propose a simple method for reducing a large set of categorical variables to a smaller number of significant ones, and demonstrate how the method can be applied to the variable-reduction problem that empirical research often faces during data processing. To this end, we introduce the concept of a pseudo-correlation coefficient, which makes it possible to use factor analysis (FA) as a tool for reducing variables. The main idea is to treat measures of association between categorical variables as analogues of Pearson's correlation coefficient so that they meet the input requirement of FA. After examining existing measures that could serve as pseudo-correlation coefficients, we select Cramer's V coefficient because it gave the best result among them. To show the detailed procedure, we present a demonstration using data from 329 R&D projects conducted in 18 private laboratories in the electric and electronics industry.
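
The pseudo-correlation idea in the abstract above can be illustrated with a minimal sketch. It computes the standard (uncorrected) Cramér's V for each pair of categorical variables and arranges the values in a symmetric matrix with a unit diagonal, which is the shape of input a factor analysis routine expects. The function names and the toy data are assumptions made for illustration; the abstract does not give the authors' implementation.

```python
from collections import Counter
from math import sqrt


def cramers_v(x, y):
    """Uncorrected Cramér's V between two equally long categorical sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    row = Counter(x)            # marginal counts of x
    col = Counter(y)            # marginal counts of y
    joint = Counter(zip(x, y))  # cell counts; missing cells count as 0

    # Pearson chi-square statistic over every cell of the contingency table.
    chi2 = 0.0
    for a, ra in row.items():
        for b, cb in col.items():
            expected = ra * cb / n
            chi2 += (joint[(a, b)] - expected) ** 2 / expected

    k = min(len(row), len(col)) - 1
    return sqrt(chi2 / (n * k)) if k > 0 else 0.0


def pseudo_correlation_matrix(columns):
    """Symmetric matrix of pairwise Cramér's V values with ones on the diagonal."""
    p = len(columns)
    r = [[1.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1, p):
            r[i][j] = r[j][i] = cramers_v(columns[i], columns[j])
    return r


# Toy data (made up): the first two variables are perfectly associated,
# the third is independent of the first.
columns = [
    ["a", "a", "b", "b"],
    ["u", "u", "v", "v"],
    ["p", "q", "p", "q"],
]
matrix = pseudo_correlation_matrix(columns)
```

A matrix built this way is not guaranteed to be positive semidefinite, so it is worth checking its eigenvalues before passing it to a factor analysis routine.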

Parallel Coordinate Plots of Mixed-Type Data

  • Kwak, Il-Youp;Huh, Myung-Hoe
    • Communications for Statistical Applications and Methods / v.15 no.4 / pp.587-595 / 2008
  • The parallel coordinate plot of Inselberg (1985) is useful for visualizing dozens of variables, but so far its applicability has been limited to numerical variables. The aim of this study is to extend the parallel coordinate plot so that it can accommodate both numerical and categorical variables. We combine Hayashi's (1950, 1952) quantification method for categorical variables with Hurley's (2004) endlink algorithm for ordering variables in the parallel coordinate plot. In line with our former study (Kwak and Huh, 2008), we develop an Andrews-type modification of the conventional straight-line parallel coordinate plot to visualize mixed-type data.

A Comparative Analysis of Risk Assessment Models for Asbestos Demolition (석면 해체 작업의 위험성평가모델 비교 분석)

  • Kim, Dong-Gyu;Kim, Min-Seung;Lee, Su-Min;Kim, Yu-Jin;Han, Seung-Woo
    • Proceedings of the Korean Institute of Building Construction Conference / 2022.11a / pp.99-100 / 2022
  • As the danger of exposure to asbestos has become known, the importance of removing asbestos from existing buildings has grown. An extensive body of research has evaluated the risk of asbestos demolition, but the types of variables considered were restricted because categorical information was not reflected and quantitative information was difficult to collect. Thus, this study aims to derive a model that predicts the risk at asbestos demolition workplaces using both categorical and continuous variables. For this purpose, categorical and continuous variables were collected from asbestos demolition reports, and the risk assessment score was set as the dependent variable. The influence of each variable was identified using logistic regression, and risk prediction methodologies were compared using decision tree regression and an artificial neural network. As a result, a conditional risk prediction model was derived to evaluate the risk of asbestos demolition, and this model is expected to be used to ensure the safety of asbestos demolition workers.
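
Regression and neural network models of the kind named above need categorical inputs converted to numbers before fitting. The sketch below shows one common way to do that, dummy (one-hot) coding with the first level dropped as the reference category, combined with continuous columns into a single design matrix. The field names `material` and `area` and the records themselves are hypothetical; the abstract does not list the variables the authors collected or how they encoded them.

```python
def dummy_code(values, drop_first=True):
    """Return (kept_levels, rows) where each row is the 0/1 coding of one value."""
    levels = sorted(set(values))
    kept = levels[1:] if drop_first else levels  # dropped level is the reference
    return kept, [[1 if v == lev else 0 for lev in kept] for v in values]


def design_matrix(records, categorical_keys, continuous_keys):
    """Build column names and numeric rows from a list of dict records."""
    names, coded_blocks = [], []
    for key in categorical_keys:
        levels, coded = dummy_code([r[key] for r in records])
        names += [f"{key}={lev}" for lev in levels]
        coded_blocks.append(coded)

    rows = []
    for i, record in enumerate(records):
        row = [x for block in coded_blocks for x in block[i]]
        row += [float(record[k]) for k in continuous_keys]
        rows.append(row)
    return names + list(continuous_keys), rows


# Hypothetical records: one categorical and one continuous variable.
records = [
    {"material": "cement", "area": 12.0},
    {"material": "spray", "area": 30.5},
    {"material": "tile", "area": 8.0},
]
names, rows = design_matrix(records, ["material"], ["area"])
```

Dropping one level per variable avoids perfect collinearity with an intercept in logistic regression; tree-based models do not need that, so `drop_first=False` is a reasonable choice there.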

A polychotomous regression model with tensor product splines and direct sums (연속형의 텐서곱과 범주형의 직합을 사용한 다항 로지스틱 회귀모형)

  • Sim, Songyong;Kang, Heemo
    • Journal of the Korean Data and Information Science Society / v.25 no.1 / pp.19-26 / 2014
  • In this paper, we propose a polychotomous regression model for the case where the independent variables include both categorical and numerical variables. For categorical independent variables we use direct sums, and for continuous independent variables we use tensor product splines. We use BIC as the variable selection criterion. We implemented the algorithm and applied it to real data. The use of direct sums and tensor products outperformed the usual multinomial logistic regression model.
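
As a small illustration of BIC-based selection, the sketch below scores candidate models with the usual definition, BIC = k ln(n) - 2 ln(L), and picks the one with the lowest value. The candidate names, log-likelihoods, and parameter counts are made-up numbers for illustration, not results from the paper, and the search strategy the authors used over spline and direct-sum terms is not described in the abstract.

```python
from math import log


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion; lower values are preferred."""
    return n_params * log(n_obs) - 2.0 * log_likelihood


def select_by_bic(candidates, n_obs):
    """candidates maps a model name to (maximized log-likelihood, parameter count)."""
    scores = {name: bic(ll, k, n_obs) for name, (ll, k) in candidates.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Made-up candidates: the larger model fits slightly better but pays a
# heavier complexity penalty.
candidates = {
    "main effects only": (-120.0, 3),
    "with interaction terms": (-118.0, 8),
}
best, scores = select_by_bic(candidates, n_obs=200)
```

With these numbers the smaller model wins, because the gain of 2 log-likelihood units does not offset five extra parameters at n = 200.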

Case Studies on the Optimal Parameter Design with Respect to Categorial Characteristics (범주형 품질특성의 최적설계 사례연구)

  • Park, Jong-In;Bae, Suk-Joo;Kim, Man-Soo
    • Journal of Korean Society of Industrial and Systems Engineering / v.32 no.3 / pp.135-141 / 2009
  • A variety of statistical methods are applied to model and optimize responses related to the quality of a product or system in terms of control and noise factors at the design and manufacturing stages. Most of them assume continuous response variables, but assessing the performance of a product or system often involves categorical observations, such as ratings and scores. Although most previous work on categorical data provides sophisticated response models and ensures unbiased outcomes, it requires heavy computation to estimate the model parameters, as well as sufficient replications. In this study, we present some practical approaches to optimal parameter design with an ordered categorical response when only a few replications, or none, are available. Two real-life examples are given to illustrate the presented methods.

Korean Nurses' Nursing Role Conceptions and Professional Commitment (간호사의 역할개념 양상과 간호직에 대한 헌신몰입에 관한 연구)

  • 이상미
    • Journal of Korean Academy of Nursing / v.21 no.3 / pp.307-322 / 1991
  • The purpose of this exploratory study was to analyze nursing role conceptions and test the relationships between nursing role conceptions and professional commitment among selected Korean nurses. Data were obtained from a convenience sample of 262 practising nurses of varying positions, education, and experience. The total sample represents a response rate of 93 percent. Subscales of Nursing Role Conceptions (Pieta, 1976) were used to measure professional, service, and bureaucratic role conceptions; the tool to measure professional commitment was developed by the investigator. The results of this study were as follows. 1. Professional role conception and service role conception were positively related (normative r = .61; categorical r = .64). Bureaucratic role conception scores (32.6 ± 4.97) were higher than professional and service role conception scores. 2. Experience was positively related to bureaucratic professional categorical role conception (r = .17, p < .01), and negatively related to bureaucratic professional role discrepancy (r = -.12, p < .01). There was no relationship between experience and service role conception. This study also showed that nurses who had longer experience tended to have higher role conceptions on all three subscales. 3. Nurses with a master's degree had significantly higher professional and bureaucratic role conception scores. Baccalaureate graduates had the lowest bureaucratic categorical role conception scores; associate nurses had the lowest professional categorical role conception scores. 4. Nursing supervisors and head nurses had significantly higher bureaucratic categorical role conception scores, whereas they had lower bureaucratic normative and professional role conception scores. 5. Age and experience were positively related to professional commitment (r = .24, r = .28). Hierarchical multiple regression analyses showed that the combination of nursing role conceptions explained greater variance in professional commitment than any of the variables alone. Further research employing dynamic designs is needed to execute rigorous tests of causal models of nursing role conceptions and professional commitment. The findings of this study suggest that antecedents and moderating variables of nursing role conception and professional commitment need to be explored for further theoretical specification and empirical evaluation.

A multivariate latent class profile analysis for longitudinal data with a latent group variable

  • Lee, Jung Wun;Chung, Hwan
    • Communications for Statistical Applications and Methods / v.27 no.1 / pp.15-35 / 2020
  • In behavioral research, significant attention has been paid to the stage-sequential process for multiple latent class variables. We explore the stage-sequential process of multiple latent class variables using multivariate latent class profile analysis (MLCPA). A latent profile variable, representing the stage-sequential process in MLCPA, is formed by a set of repeatedly measured categorical response variables. This paper proposes an extended MLCPA that explains the association between the latent profile variable and the latent group variable in the form of a two-dimensional contingency table. We applied the extended MLCPA to the National Longitudinal Survey on Youth 1997 (NLSY97) data to investigate the association between the developmental progression of depression and substance use behaviors among adolescents who experienced authoritarian parental styles in their youth.

An Improvement on Estimation for Causal Models of Categorical Variables of Abilities and Task Performance

  • Kim, Sung-Ho
    • Communications for Statistical Applications and Methods / v.7 no.1 / pp.65-86 / 2000
  • The estimates from an EM algorithm applied to a large causal model of 10 or more categorical variables are often sensitive to the initial values of the estimates. This phenomenon becomes more serious as the model structure becomes more complicated and involves more variables. In this regard, Wu (1983) recommends, among other things, that EM be run several times with different sets of initial values to obtain more appropriate estimates. In this paper, a new approach to choosing initial values is proposed. The main idea is to use initial values that are calibrated to the data. A simulation result strongly indicates that the calibrated initial values give rise to estimates that are far closer to the true values than those from initial values that are not calibrated.
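
The multi-start strategy attributed to Wu (1983) above can be sketched as follows. This is the baseline practice of rerunning EM from several random initial values and keeping the run with the highest log-likelihood; it is not the calibrated-initials method the paper proposes, which the abstract does not specify. The model here is an assumption chosen for brevity: a simple latent class model with binary items that are independent within each class. All function names are illustrative.

```python
import random
from math import log


def _joint(row, weights, probs):
    """Joint probability of one record with each latent class."""
    out = []
    for w, class_probs in zip(weights, probs):
        p = w
        for v, pj in zip(row, class_probs):
            p *= pj if v else 1.0 - pj
        out.append(p)
    return out


def log_likelihood(data, weights, probs):
    return sum(log(sum(_joint(row, weights, probs))) for row in data)


def em_once(data, n_classes, rng, n_iter=100, eps=1e-6):
    """One EM run from a random start; returns (log-likelihood, weights, probs)."""
    n_items = len(data[0])
    weights = [1.0 / n_classes] * n_classes
    probs = [[rng.uniform(0.2, 0.8) for _ in range(n_items)]
             for _ in range(n_classes)]
    for _ in range(n_iter):
        # E-step: posterior class membership of every record.
        resp = []
        for row in data:
            joint = _joint(row, weights, probs)
            total = sum(joint)
            resp.append([q / total for q in joint])
        # M-step: update class weights and item probabilities.
        for c in range(n_classes):
            mass = sum(r[c] for r in resp)
            weights[c] = mass / len(data)
            if mass == 0.0:
                continue  # empty class: leave its item probabilities unchanged
            for j in range(n_items):
                hit = sum(r[c] for r, row in zip(resp, data) if row[j])
                # Clip away from 0 and 1 so later logarithms stay finite.
                probs[c][j] = min(max(hit / mass, eps), 1.0 - eps)
    return log_likelihood(data, weights, probs), weights, probs


def em_multistart(data, n_classes, n_starts=10, seed=0):
    """Run EM from several random starts and keep the best run."""
    rng = random.Random(seed)
    runs = [em_once(data, n_classes, rng) for _ in range(n_starts)]
    return max(runs, key=lambda run: run[0])


# Toy data (made up): two clearly separated response patterns.
data = [[1, 1, 1]] * 20 + [[0, 0, 0]] * 20
best_ll, best_weights, best_probs = em_multistart(data, n_classes=2, n_starts=5, seed=1)
```

Comparing the log-likelihoods of the individual runs is a quick way to see how much the result depends on the starting point for a given model and data set.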

Performance Evaluation of Military Corps with Categorical Environmental Variables (범주형 환경변수를 고려한 부대성과평가 방법에 관한 연구 - DEA와 CCCA의 결합을 중심으로 -)

  • Lee, Kyung-Won;Park, Myung-Seop;Im, Jae-Poong
    • Journal of the military operations research society of Korea / v.32 no.1 / pp.51-72 / 2006
  • The performance of a corps is often influenced not only by its own efforts but also by the commander of the next higher unit in a vertical organizational structure. When the direction given by the commander of the next higher organization differs from that of the actual evaluation agency, the unit under evaluation may be rated lower than it deserves. This study suggests an alternative method for evaluating the performance of military units when there are critical environmental factors that affect performance. The method employs DEA, a nonparametric method, and Constrained Canonical Correlation Analysis (CCCA), a parametric method used to estimate an efficient frontier with multiple dependent variables and constraints. This article also uses a set of categorical environmental variables in the CCCA to improve the fairness of performance evaluation. It is shown that introducing the categorical variables helps evaluate the true performance of individual units, such as battalions subordinated to different next higher commanders.

Bayesian Analysis for Categorical Data with Missing Traits Under a Multivariate Threshold Animal Model (다형질 Threshold 개체모형에서 Missing 기록을 포함한 이산형 자료에 대한 Bayesian 분석)

  • Lee, Deuk-Hwan
    • Journal of Animal Science and Technology / v.44 no.2 / pp.151-164 / 2002
  • Genetic variance and covariance components of linear traits and ordered categorical traits, which are usually observed as dichotomous or polychotomous outcomes, were estimated simultaneously in a multivariate threshold animal model based on arbitrary underlying liability scales, using Bayesian inference via Gibbs sampling algorithms. The multivariate threshold animal model in this study allows any combination of missing traits, assuming correlation among the traits considered. Gibbs sampling, as a form of hierarchical Bayesian inference, was used to obtain reliable point estimates, taken to be the marginal posterior means of the parameters. The main point of this study is that the underlying values for the observations on the categorical traits sampled in the previous round of iteration, together with the observations on the continuous traits, can be used to sample the underlying values for missing categorical and continuous data in the current cycle (see appendix). This study also showed that the underlying variables for missing categorical data should be generated taking the correlated traits into account in order to satisfy the fully conditional posterior distributions of the parameters, although some papers (Wang et al., 1997; VanTassell et al., 1998) reported that only the residual effects of missing traits were generated in the same situation. In the present study, the Gibbs samplers used for fully Bayesian inference on the unknown parameters of interest serve as a methodology that accommodates any combination of linear and categorical traits with missing observations. Moreover, two kinds of constraints that guarantee identifiability of the arbitrary underlying variables are shown to preserve the fully conditional posterior distributions of those parameters. A numerical example, based on a simulation study, is provided for a threshold animal model that includes maternal and permanent environmental effects on a multiple ordered categorical trait (calving ease), a binary trait (non-return rate), and a normally distributed trait (birth weight).
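
The threshold idea underlying the model above is that an ordered categorical outcome is determined by which interval, between fixed cut points, an unobserved continuous liability falls into. The sketch below shows only that mapping. The threshold values and the convention that a liability exactly equal to a threshold falls in the upper category are assumptions for illustration; the Gibbs sampler and the genetic model described in the abstract are not reproduced here.

```python
from bisect import bisect_right


def categorize(liability, thresholds):
    """Map a continuous liability to a 0-based ordered category.

    `thresholds` must be sorted in increasing order; with t thresholds
    there are t + 1 categories.
    """
    return bisect_right(thresholds, liability)


# Illustrative cut points giving three ordered categories.
thresholds = [0.0, 1.0]
categories = [categorize(x, thresholds) for x in (-1.0, 0.5, 2.0)]
```

A binary trait is the special case with a single threshold, where the outcome is 1 when the liability is at or above the cut point and 0 otherwise.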