• Title/Summary/Keyword: Categorical data analysis (범주형 자료분석)


Computing Algorithm for Genetic Evaluations on Several Linear and Categorical Traits in A Multivariate Threshold Animal Model (범주형 자료를 포함한 다형질 임계개체모형에서 유전능력 추정 알고리즘)

  • Lee, D.H.
    • Journal of Animal Science and Technology / v.46 no.2 / pp.137-144 / 2004
  • Algorithms were developed and demonstrated for estimating breeding values on several categorical traits by using latent variables under the threshold concept. Thresholds for each categorical trait were estimated by Newton's method using the gradient and Hessian matrix. The algorithm extends the bivariate analysis provided by Quaas (2001). Breeding values on the latent variables of categorical traits and on the observations of linear traits were estimated by the preconditioned conjugate gradient (PCG) method, which is known to converge quickly. An example used simulated data with two linear traits, a categorical trait with four categories (CE = calving ease), and a dichotomous trait (SB = stillbirth) in a threshold animal mixed model (TAMM). Breeding value estimates in TAMM were compared with those in a linear animal mixed model (LAMM). The correlations of the breeding value estimates with the parameters were 0.91~0.92 for CE and 0.87~0.89 for SB in TAMM, versus 0.72~0.84 for CE and 0.59~0.70 for SB in LAMM. In conclusion, the PCG method is feasible for estimating breeding values on several categorical traits together with linear traits in TAMM.
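
The preconditioned conjugate gradient step mentioned in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the Jacobi (diagonal) preconditioner, the function name `pcg`, and the small random symmetric positive-definite system standing in for mixed-model equations are all assumptions made for illustration.

```python
import numpy as np

def pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for a symmetric positive-definite A.

    Uses conjugate gradients with a Jacobi (diagonal) preconditioner,
    which is an assumption here, not necessarily the paper's choice.
    """
    m_inv = 1.0 / np.diag(A)            # inverse of the diagonal preconditioner
    x = np.zeros_like(b, dtype=float)   # start from the zero vector
    r = b - A @ x                       # initial residual
    z = m_inv * r                       # preconditioned residual
    p = z.copy()                        # first search direction
    rz = r @ z
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rz / (p @ Ap)           # step length along p
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            break                       # relative residual is small enough
        z = m_inv * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p       # next conjugate direction
        rz = rz_new
    return x

# Hypothetical toy system; real mixed-model equations are far larger and sparse.
rng = np.random.default_rng(0)
G = rng.standard_normal((6, 6))
A = G @ G.T + 6.0 * np.eye(6)           # symmetric positive-definite by construction
b = rng.standard_normal(6)
x = pcg(A, b)
```

In practice the appeal of PCG for large genetic evaluations is that it only needs matrix-vector products, so the coefficient matrix never has to be factorized or even stored densely.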

A Sequence of Models for Categorical Data with Compound Scales (복합척도의 범주형 자료에 대한 연속 모형)

  • 최재성
    • The Korean Journal of Applied Statistics / v.14 no.1 / pp.103-110 / 2001
  • This paper considers a multistage experiment. Response scales can be the same or different from stage to stage. When the variables have a nested structure, the response variable at each stage can be defined conditionally. For analyzing such data with compound scales, this paper suggests a sequence of dependence models and shows how to set up the sequence of models for driver's license test data.


Models for Polytomous Data with a Nested Structure (지분구조의 다가자료에 관한 모형)

  • 최재성
    • Communications for Statistical Applications and Methods / v.4 no.2 / pp.377-384 / 1997
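  • This paper presents data-analysis models for categorical data with a nested structure when the data are nominal polytomous data. The models take into account the types of nested variables that can be defined at each stage of the nested structure and the variables that affect the probabilities of interest for those nested variables.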


An Analysis of Categorical Time Series Driven by Clipping GARCH Processes (연속형-GARCH 시계열의 범주형화(Clipping)를 통한 분석)

  • Choi, M.S.;Baek, J.S.;Hwan, S.Y.
    • The Korean Journal of Applied Statistics / v.23 no.4 / pp.683-692 / 2010
  • This short article is concerned with a categorical time series obtained by clipping a heteroscedastic GARCH process. Estimation methods are discussed for the model parameters that appear both in the original process and in the binary time series that results from clipping (cf. Zhen and Basawa, 2009). Assuming an AR-GARCH model for the heteroscedastic time series, three data sets from the Korean stock market are analyzed, with applications to calculating certain probabilities associated with the AR-GARCH process.
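
As a rough illustration of what clipping means here, the sketch below simulates a plain GARCH(1,1) series and converts it to a binary series by thresholding at zero. The parameter values, the zero threshold, and the function names `simulate_garch11` and `clip` are assumptions for illustration; the article itself works with an AR-GARCH model and real stock market data.

```python
import numpy as np

def simulate_garch11(n, omega=0.1, alpha=0.1, beta=0.8, seed=0):
    """Simulate n values of a GARCH(1,1) process with Gaussian innovations.

    The parameter values are illustrative assumptions, not estimates
    taken from the article.
    """
    rng = np.random.default_rng(seed)
    eps = np.empty(n)
    sigma2 = omega / (1.0 - alpha - beta)   # start at the unconditional variance
    for t in range(n):
        eps[t] = np.sqrt(sigma2) * rng.standard_normal()
        # Conditional variance for the next time point
        sigma2 = omega + alpha * eps[t] ** 2 + beta * sigma2
    return eps

def clip(series, threshold=0.0):
    """Clip a continuous series into a binary one: 1 if above threshold, else 0."""
    return (np.asarray(series) > threshold).astype(int)

returns = simulate_garch11(500)
binary = clip(returns)                      # categorical (binary) time series
```

The binary series discards the magnitudes and keeps only which side of the threshold each value falls on, which is what makes it a categorical time series.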

Categorical time series clustering: Case study of Korean pro-baseball data (범주형 시계열 자료의 군집화: 프로야구 자료의 사례 연구)

  • Pak, Ro Jin
    • Journal of the Korean Data and Information Science Society / v.27 no.3 / pp.621-627 / 2016
  • A professional baseball team sometimes tends to be very weak against one particular opponent. For example, team S, the strongest team in Korea, is relatively weak against team H. In this paper, we cluster the Korean baseball teams based on their records against team S to investigate whether the pattern of team H's record differs from those of the other teams. The technique employed is 'time series clustering', or more specifically 'categorical time series clustering'. Three methods are considered: (i) a distance-based method, (ii) a genetic sequencing method, and (iii) a periodogram method. Each method has its own advantages and disadvantages in handling categorical time series, so we recommend drawing conclusions by considering the results of all three methods together.
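
One simple way to picture a distance-based comparison of categorical time series is the proportion of time points at which two series disagree (a normalized Hamming distance). The sketch below is only an illustration under that assumption; the paper does not state which distance it uses, and the team names and win/loss records are hypothetical.

```python
import numpy as np

def mismatch_distance(a, b):
    """Proportion of positions at which two equal-length categorical series differ."""
    a, b = np.asarray(a), np.asarray(b)
    return float(np.mean(a != b))

# Hypothetical win (W) / loss (L) records against a common opponent.
records = {
    "team_A": list("WLWWLL"),
    "team_B": list("WLWWLW"),
    "team_C": list("LWLLWW"),
}

names = list(records)
dist = np.zeros((len(names), len(names)))
for i, u in enumerate(names):
    for j, v in enumerate(names):
        dist[i, j] = mismatch_distance(records[u], records[v])
```

A pairwise matrix like `dist` could then be passed to any standard clustering routine that accepts precomputed distances.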

Analysis of categorical data with nonresponses (무응답을 포함하는 범주형 자료의 분석)

  • 박태성;이승연
    • The Korean Journal of Applied Statistics / v.11 no.1 / pp.83-95 / 1998
  • Statistical models are proposed for analyzing categorical data in the presence of missing observations or nonresponses, which can occur in sample surveys and polls. As an illustration, we analyze real pre-election polling data from the 1948 US presidential election. It had been predicted that Dewey would win the election; however, Truman won the actual election.


Empirical Bayesian Misclassification Analysis on Categorical Data (범주형 자료에서 경험적 베이지안 오분류 분석)

  • 임한승;홍종선;서문섭
    • The Korean Journal of Applied Statistics / v.14 no.1 / pp.39-57 / 2001
  • Categorical data sometimes contain misclassification errors. If such data are analyzed, the estimated cell probabilities can be biased and the standard Pearson X² tests may have inflated true type I error rates. On the other hand, if we treat well-classified data as misclassified, we may spend considerable cost and time adjusting for misclassification. It is therefore a necessary and important step to ask whether categorical data are misclassified before analyzing them. In this paper, we consider a two-dimensional contingency table in which one of the two variables is misclassified and the marginal sums of the well-classified variable are fixed, and we explore partitioning the marginal sums into the cells via the Bound and Collapse concepts of Sebastiani and Ramoni (1997). The double sampling scheme (Tenenbein 1970) is used to obtain information on misclassification. We propose test statistics to address the misclassification problem and examine their behavior through simulation studies.


Maximum Trimmed Likelihood Estimator for Categorical Data Analysis (범주형 자료분석을 위한 최대절사우도추정)

  • Choi, Hyun-Jip
    • Communications for Statistical Applications and Methods / v.16 no.2 / pp.229-238 / 2009
  • We propose a simple algorithm for obtaining maximum trimmed likelihood (MTL) estimates. Through a series of elimination steps that depend on the likelihood of the cells in a contingency table, the algorithm finds the subset used to obtain the global maximum. To evaluate the performance of the algorithm for MTL estimators, we conducted simulation studies. The results showed that the algorithm is very competitive in the computational burden required to obtain the same or similar results as complete enumeration.

Small Sample Characteristics of Generalized Estimating Equations for Categorical Repeated Measurements (범주형 반복측정자료를 위한 일반화 추정방정식의 소표본 특성)

  • 김동욱;김재직
    • The Korean Journal of Applied Statistics / v.15 no.2 / pp.297-310 / 2002
  • Liang and Zeger proposed generalized estimating equations (GEE) for analyzing repeated measurements that are discrete or continuous. The GEE model can be extended to repeated categorical data, and its estimator is asymptotically multivariate normal in large samples. However, GEE rests on large-sample asymptotic theory. In this paper, we study the properties of GEE estimators for repeated ordinal data in small samples. We generate ordinal repeated measurements for two groups using two methods. Through Monte Carlo simulation studies, we investigate the empirical type I error rates, the powers, the relative efficiencies of the GEE estimators, the effect of unequal sample sizes in the two groups, and the performance of variance estimators for polytomous ordinal response variables, especially in small samples.

Landslide Susceptibility Analysis Using Probabilistic Spatial Data Integration Models (확률론적 공간 자료 통합 모델을 이용한 산사태 취약성 분석)

  • Park, No-Uk;Ji, Gwang-Hun;Gwon, Byeong-Du
    • Proceedings of the Korean Earth Science Society Conference (한국지구과학회:학술대회논문집) / 2005.02a / pp.254-260 / 2005
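  • In this paper, for the purpose of landslide susceptibility analysis, we applied a nonparametric likelihood ratio estimation model and a parametric predictive discriminant analysis model, both of which can efficiently handle categorical and continuous data within a probabilistic spatial integration framework. To compare the applied models, case studies were carried out for the Jangheung area in Gyeonggi-do and the Boeun area in Chungcheongbuk-do, both of which suffered heavy landslide damage in the summer of 1998. In the Jangheung area the two models showed similar predictive ability, whereas in the Boeun area the parametric predictive discriminant analysis model showed relatively higher predictive ability. In conclusion, the two proposed models can be applied efficiently to the representation of continuous data for landslide susceptibility analysis, and because each model has its own characteristics in representing continuous data, verification through other case studies should be carried out in parallel.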
