• Title/Summary/Keyword: Categorical Information

Investigation of Biases for Variance Components on Multiple Traits with Varying Number of Categories in Threshold Models Using Bayesian Inferences

  • Lee, D.H.
    • Asian-Australasian Journal of Animal Sciences / v.15 no.7 / pp.925-931 / 2002
  • Gibbs sampling algorithms were implemented for multi-trait threshold animal models with any combination of binary, ordered categorical, and linear traits, and the amount of bias in these models was investigated under two kinds of parameterization and of algorithms for generating underlying liabilities. Statistical models that included additive genetic and residual effects as random effects and contemporary group effects as fixed effects were fitted to simulated data. The fully conditional posterior means of heritabilities and genetic (residual) correlations were calculated from 1,000 samples, obtained by retaining every 10th sample after the first 15,000 samples were discarded as a "burn-in" period. Under the models considered, several combinations of three traits (binary, multiple ordered categories, and continuous) were analyzed. Five replicates were carried out. Posterior-mean estimates of heritabilities and genetic (residual) correlations were unbiased when the underlying liabilities for a categorical trait were generated conditional on the underlying liabilities of the other traits and the threshold estimates were rescaled. Otherwise, when binary traits were parameterized with a threshold of zero and a residual variance of one, heritability estimates were inflated by 7-10%. Genetic correlation estimates were biased upward if positively correlated and downward if negatively correlated when underlying liabilities were generated without accounting for the correlated traits as prior information. In that case, residual correlation estimates were consequently strongly biased downward if positively correlated and upward if negatively correlated. The more categories a categorical trait had, the better the mixing rate.
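
The burn-in and thinning scheme described in this abstract can be sketched as follows. This is only an illustration of the sample-retention arithmetic, not the author's sampler: the chain length of 25,000 is inferred from 15,000 + 1,000 × 10, and the function names and placeholder draws are assumptions.

```python
def thin_chain(samples, burn_in=15_000, thin=10):
    """Drop the first `burn_in` draws, then keep every `thin`-th remaining draw."""
    return samples[burn_in::thin]


def posterior_mean(samples):
    """Average the retained draws to estimate a posterior mean."""
    return sum(samples) / len(samples)


# Placeholder chain: the values are just iteration indices, not real Gibbs draws.
# The length 25,000 is an assumption inferred from 15,000 + 1,000 * 10.
chain = [float(i) for i in range(25_000)]

kept = thin_chain(chain)
estimate = posterior_mean(kept)
```

With these settings exactly 1,000 draws are retained, which matches the count reported in the abstract.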

Imputation for Binary or Ordered Categorical Traits Based on the Bayesian Threshold Model (베이지안 분계점 모형에 의한 순서 범주형 변수의 대체)

  • Lee Seung-Chun
    • The Korean Journal of Applied Statistics / v.18 no.3 / pp.597-606 / 2005
  • Nonresponse in sample surveys causes problems when analyzing data sets in public-use files, where the user has only complete-data methods available and limited information about the reasons for nonresponse. Imputation has recently become a standard approach for handling nonresponse, and various imputation methods have been devised. However, most imputation methods are concerned with continuous traits, while many features of interest in sample surveys are measured on binary or ordered categorical scales. In this note, an imputation method for ignorable nonresponse in binary or ordered categorical traits is considered.

Clustering method for similar user with Mixed Data in SNS

  • Song, Hyoung-Min;Lee, Sang-Joon;Kwak, Ho-Young
    • Journal of the Korea Society of Computer and Information / v.20 no.11 / pp.25-30 / 2015
  • The enormous increase in data brought by the development of information technology makes it hard for internet users to find information tailored to their needs. In this changing environment, information filtering methods, which provide sorted-out information to users, are becoming important. Data on the internet exist in various types. However, the similarity calculation algorithms frequently used in existing collaborative filtering methods tend to suit numeric data. In addition, for categorical data they yield extreme, Boolean-like similarity values. In this paper, we compute the similarity of SNS user information consisting of mixed data using Gower's similarity coefficient, and we suggest a method that is softer than an all-or-nothing expression such as 0 or 1 for categorical data. The clustering method using this algorithm can be utilized in SNS or in various recommendation systems.
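
For orientation, the standard Gower similarity coefficient for mixed data can be sketched as below. This shows only the textbook form, in which categorical features score 0 or 1; it does not reproduce the softer categorical scoring the authors propose. The user records and feature names are hypothetical.

```python
def gower_similarity(a, b, numeric_ranges):
    """Standard Gower similarity between two records with mixed features.

    `numeric_ranges` maps each numeric feature to its (max - min) range over
    the data set; every other feature is treated as categorical.
    """
    scores = []
    for key in a:
        if key in numeric_ranges:
            spread = numeric_ranges[key]
            # Numeric feature: 1 minus the range-normalized absolute difference.
            scores.append(1.0 - abs(a[key] - b[key]) / spread if spread > 0 else 1.0)
        else:
            # Categorical feature: exact match scores 1, anything else 0.
            scores.append(1.0 if a[key] == b[key] else 0.0)
    return sum(scores) / len(scores)


# Hypothetical SNS user records (illustrative values only).
users = [
    {"age": 20, "posts": 10, "region": "Seoul", "gender": "F"},
    {"age": 30, "posts": 10, "region": "Jeju", "gender": "F"},
    {"age": 40, "posts": 50, "region": "Seoul", "gender": "M"},
]
numeric_ranges = {
    key: max(u[key] for u in users) - min(u[key] for u in users)
    for key in ("age", "posts")
}

sim_01 = gower_similarity(users[0], users[1], numeric_ranges)
sim_02 = gower_similarity(users[0], users[2], numeric_ranges)
```

The resulting pairwise similarities (or 1 minus them, as distances) could then be passed to any clustering routine that accepts a precomputed matrix.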

Contour Plot to Explore the Structure of Categorical Data

  • Kim, Hyun Chul;Huh, Moon Yul;Chung, Hee Suk
    • Communications for Statistical Applications and Methods / v.10 no.2 / pp.371-385 / 2003
  • In this paper, the contour plot is considered as a method to explore the structure of categorical data. For this purpose, the paper suggests a method to sort a two-way contingency table with respect to the expected marginals. The suggested plot is found to provide valuable information about the underlying data structure. First, we can investigate independence between the categories by examining the differences between expected frequency contours and observed frequency contours. With the plot, we can also visually investigate the existence of outliers in the data. These properties of the suggested contour plot are demonstrated on several sets of real data.
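
The quantities such a plot compares can be sketched as follows: expected counts under independence (row total × column total / grand total) and a table reordered by its marginal totals. Sorting by ascending marginal totals is one plausible reading of "sort with respect to the expected marginals" and is an assumption here, as are the function names and the toy table.

```python
def expected_counts(table):
    """Expected cell counts under row/column independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def sort_by_marginals(table):
    """Reorder rows and columns by ascending marginal totals (assumed ordering)."""
    col_totals = [sum(col) for col in zip(*table)]
    row_order = sorted(range(len(table)), key=lambda i: sum(table[i]))
    col_order = sorted(range(len(col_totals)), key=lambda j: col_totals[j])
    return [[table[i][j] for j in col_order] for i in row_order]


# Toy 2x2 contingency table (illustrative values only).
observed = [[20, 40], [30, 10]]

expected = expected_counts(observed)
residuals = [
    [o - e for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
reordered = sort_by_marginals(observed)
```

Contours drawn over `observed` and `expected` on the reordered table would then make departures from independence visible as regions where the two surfaces disagree.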

Categorical Analysis for Finite Cellular Automata Rule 15 (유한 셀룰러 오토마타 규칙 15에 대한 카테고리적 분석)

  • Park, Jung-Hee;Lee, Hyen-Yeal
    • Journal of KIISE:Computer Systems and Theory / v.27 no.8 / pp.752-757 / 2000
  • The recursive formulae that self-reproduce the state transition graphs of the one-dimensional cellular automaton rule 15, with two states (0 and 1) and four different boundary conditions, were found by a categorical approach. The categorical approach makes the evolution process of cellular automata easy to express, since it enables the mapping of automata between different domains.
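
As background, the state transition graph of a finite rule 15 automaton can be enumerated directly, as in the sketch below. It uses the standard Wolfram rule numbering and only two boundary conditions, null (fixed 0) and periodic; the abstract does not name its four boundary conditions, so this choice is an assumption, and the code does not reproduce the paper's recursive formulae.

```python
from itertools import product


def step(cells, rule=15, boundary="null"):
    """Apply one update of an elementary cellular automaton to a finite row."""
    n = len(cells)
    out = []
    for i in range(n):
        if boundary == "periodic":
            left, right = cells[i - 1], cells[(i + 1) % n]
        else:  # null boundary: cells outside the row are fixed at 0
            left = cells[i - 1] if i > 0 else 0
            right = cells[i + 1] if i < n - 1 else 0
        # Wolfram numbering: the neighborhood (left, self, right) selects a bit of `rule`.
        index = (left << 2) | (cells[i] << 1) | right
        out.append((rule >> index) & 1)
    return out


def transition_graph(n, rule=15, boundary="null"):
    """Map every one of the 2**n states to its successor state."""
    return {
        state: tuple(step(list(state), rule, boundary))
        for state in product((0, 1), repeat=n)
    }


graph = transition_graph(2)
null_next = step([1, 0, 0, 1], boundary="null")
periodic_next = step([1, 0, 0, 1], boundary="periodic")
```

Under rule 15 each cell becomes the complement of its left neighbor, so the boundary condition only affects how the leftmost cell is updated.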

The Content Structure of the Navigation Course Using Learning Hierarchy (학습위계에 의한 항해교과의 내용 구조화)

  • Yoon, Hyun-Sang
    • Journal of Fisheries and Marine Sciences Education / v.6 no.2 / pp.198-216 / 1994
  • Improving instructional effectiveness by reorganizing textbook content is one of the major concerns of many education theorists and teachers. Research on this problem has shown that reorganizing textbook content improves learners' recall and problem-solving ability. The content of the current navigation textbook has a categorical structure as its basic framework, though a poor one. A categorical structure is known to provide learners with an inferior information-processing mechanism compared with a learning hierarchy content structure. Furthermore, the current content structure gives no consideration to navigation in practice, spatial contexts, or the sequence of events as a ship sails from one harbor to another. The learning hierarchy content structure has the advantage of giving learners more systematic and stronger knowledge networks than a categorical structure.

GOODNESS OF FIT TESTS BASED ON DIVERGENCE MEASURES

  • Pasha, Eynollah;Kokabi, Mohsen;Mohtashami, Gholam Reza
    • Journal of applied mathematics & informatics / v.26 no.1_2 / pp.177-189 / 2008
  • In this paper, we investigate goodness-of-fit tests based on divergence measures. For categorical data, under certain regularity conditions, we obtain the asymptotic distribution of these tests. We also propose a modified test that improves the rate of convergence. In the continuous case, we use our modified entropy estimator [10] to estimate Kullback-Leibler information. A comparative study based on simulation results is also discussed.
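
A familiar instance of a divergence-based goodness-of-fit statistic for categorical data is the likelihood-ratio statistic G = 2n · KL(p̂ ‖ p₀), which is asymptotically chi-square with k − 1 degrees of freedom under the null hypothesis. The sketch below computes it next to Pearson's statistic; it is a generic illustration and not the modified test proposed in the paper. The counts and function names are assumptions.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def g_statistic(observed, null_probs):
    """Likelihood-ratio statistic G = 2 * n * KL(p_hat || p0)."""
    n = sum(observed)
    p_hat = [count / n for count in observed]
    return 2.0 * n * kl_divergence(p_hat, null_probs)


def pearson_statistic(observed, null_probs):
    """Pearson chi-square statistic, shown for comparison."""
    n = sum(observed)
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, null_probs))


# Illustrative counts: 100 observations over two categories, null of equal probability.
observed = [60, 40]
null_probs = [0.5, 0.5]

g = g_statistic(observed, null_probs)
x2 = pearson_statistic(observed, null_probs)
```

Both statistics would be compared against a chi-square distribution with one degree of freedom in this two-category example.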

A Comparative Analysis of Risk Assessment Models for Asbestos Demolition (석면 해체 작업의 위험성평가모델 비교 분석)

  • Kim, Dong-Gyu;Kim, Min-Seung;Lee, Su-Min;Kim, Yu-Jin;Han, Seung-Woo
    • Proceedings of the Korean Institute of Building Construction Conference / 2022.11a / pp.99-100 / 2022
  • As the danger of exposure to asbestos has been revealed, the importance of demolishing asbestos in existing buildings has grown. An extensive body of research has evaluated the risk of asbestos demolition, but the types of variables considered were limited because categorical information was not reflected and there were limitations in collecting quantitative information. Thus, this study aims to derive a model that predicts the risk at asbestos demolition workplaces by collecting both categorical and continuous variables. For this purpose, categorical and continuous variables were collected from asbestos demolition reports, and the risk assessment score was set as the dependent variable. The influence of each variable was identified using logistic regression, and risk prediction methodologies were compared using decision tree regression and an artificial neural network. As a result, a conditional risk prediction model was derived to evaluate the risk of asbestos demolition, and this model is expected to be used to ensure the safety of asbestos demolition workers.

Latent class model for mixed variables with applications to text data (혼합모드 잠재범주모형을 통한 텍스트 자료의 분석)

  • Shin, Hyun Soo;Seo, Byungtae
    • The Korean Journal of Applied Statistics / v.32 no.6 / pp.837-849 / 2019
  • Latent class models (LCM) are useful tools for drawing hidden information from categorical data. This model can also be interpreted as a mixture model with multinomial component distributions. In some cases, however, an available dataset may contain both categorical and count or continuous data. For such cases, we can extend the LCM to a mixture model with both multinomial and other component distributions, such as normal and Poisson distributions. In this paper, we consider an LCM for data containing categorical and count data to analyze the Drug Review dataset, which contains categorical responses and text reviews. From this data analysis, we show that we can obtain more specific hidden information than that from the LCM with categorical responses only.