• Title/Summary/Keyword: 범주형자료분석 (categorical data analysis)

Steal Success Model for 2007 Korean Professional Baseball Games (2007년 한국프로야구에서 도루성공모형)

  • Hong, Chong-Sun;Choi, Jeong-Min
    • The Korean Journal of Applied Statistics / v.21 no.3 / pp.455-468 / 2008
  • Extensive baseball game records show that the steal plays an important role in the outcome of games. To study the success or failure of steals, logistic regression models are developed from 2007 Korean professional baseball game data. The logistic regression models are compared with discriminant models, and logistic regression is found to perform better than discriminant analysis. We also consider an alternative logistic regression model based on categorical data transformed from continuous data that are difficult to obtain.

Landslide Susceptibility Analysis Using Probabilistic Spatial Data Integration Models (확률론적 공간 자료 통합 모델을 이용한 산사태 취약성 분석)

  • Park, No-Uk;Ji, Gwang-Hun;Gwon, Byeong-Du
    • Korean Earth Science Society: Conference Proceedings (한국지구과학회:학술대회논문집) / 2005.02a / pp.254-260 / 2005
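  • This paper applies a nonparametric likelihood ratio estimation model and a parametric predictive discriminant analysis model to landslide susceptibility analysis. Both models can efficiently handle categorical and continuous data within a probabilistic spatial integration framework. To compare the models, case studies were carried out for the Jangheung area in Gyeonggi-do and the Boeun area in Chungcheongbuk-do, both of which suffered heavy landslide damage in the summer of 1998. In the Jangheung area the two models showed similar predictive ability, whereas in the Boeun area the parametric predictive discriminant analysis model showed relatively higher predictive ability. In conclusion, the two proposed models can be applied effectively to representing continuous data for landslide susceptibility analysis. Because each model represents continuous data in its own way, they should also be validated through other case studies.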

Maximum Trimmed Likelihood Estimator for Categorical Data Analysis (범주형 자료분석을 위한 최대절사우도추정)

  • Choi, Hyun-Jip
    • Communications for Statistical Applications and Methods / v.16 no.2 / pp.229-238 / 2009
  • We propose a simple algorithm for obtaining maximum trimmed likelihood (MTL) estimates. The algorithm finds the subset that yields the global maximum through a series of elimination steps that depend on the likelihood of the cells in a contingency table. To evaluate the algorithm, we conducted simulation studies. The results show that the algorithm is very competitive in the computational burden required to obtain the same or similar results as complete enumeration.

A Comparative Analysis of Measures of Association for Categorical Data (범주형 자료에서 연관성 측도들의 비교 분석)

  • 홍종선;임한승
    • Communications for Statistical Applications and Methods / v.4 no.3 / pp.645-661 / 1997
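  • We compare the correlation of continuous variables with measures of association for categorical variables. For this study, bivariate normal distributions of continuous variables with correlations ranging from +1 to -1 were used to generate 2$\times$2 contingency tables and, as an extension, 3$\times$3 contingency tables standing in for general I$\times$J tables. Measures of association defined for two-way contingency tables are computed and discussed. For 2$\times$2 tables, we describe the cross-product ratio $\alpha$, Yule's [1912] Q and Y statistics, which are functions of the cross-product ratio, the correlation coefficient R, and the P statistic, which is a function of R, and we analyze the values obtained from the generated tables. For 3$\times$3 tables, we describe the P, T, and V statistics, which are functions of Pearson's test statistic of independence $X^2$, the $\lambda_{C/R}$ statistic of Goodman and Kruskal [1954], and the $\tau_{R/C}$ statistic of Light and Margolin [1971], and compare their values with Pearson's correlation coefficient.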
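
As a rough illustration of the 2$\times$2 measures named in this abstract, the sketch below computes the cross-product ratio and Yule's Q and Y from four cell counts. The function name `association_2x2` and the cell labels a, b, c, d are assumptions made for this example; they do not come from the paper, and the paper's R and P statistics are not reproduced here.

```python
import math

def association_2x2(a, b, c, d):
    """Return (alpha, Q, Y) for the 2x2 table [[a, b], [c, d]].

    alpha is the cross-product (odds) ratio; Q and Y are Yule's
    measures, both written as functions of alpha.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    alpha = (a * d) / (b * c)          # cross-product ratio
    q = (alpha - 1) / (alpha + 1)      # Yule's Q = (ad - bc) / (ad + bc)
    root = math.sqrt(alpha)
    y = (root - 1) / (root + 1)        # Yule's Y, the same form applied to sqrt(alpha)
    return alpha, q, y
```

Both Q and Y equal 0 when alpha is 1 (independence) and move toward +1 or -1 as the association strengthens, which is what makes them comparable with a correlation coefficient.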

Processes of Voluntary Services Delivered by Korean Undergraduates: An Approach Based on the Grounded Theory (대학생의 자발적 봉사활동에 대한 질적 연구: 근거이론을 중심으로)

  • Hu, Sungho;Jung, Taeyun
    • Korean Journal of Culture and Social Issue / v.17 no.3 / pp.287-304 / 2011
  • The purpose of this study is to understand the phases and paradigms of voluntary services offered by undergraduates and the processes by which those services are carried out. To this end, 23 undergraduates (10 men, 13 women) were interviewed from August 2008 to April 2009, and the interview data were analyzed on the basis of grounded theory. The main analysis procedure was coding (open coding, axial coding, and selective coding). Open coding produced 119 concepts, 41 subcategories, and 16 categories. Axial coding was then conducted to organize the basic framework of generic relationships among psychological motivation, social context, personal perception, practical action, psychological response, and psychological consequence. The core essence is "Volunteer types are categorized into the simple practice type, the self-serving type, and the community type." Finally, undergraduate volunteers were described in terms of three types (simple practice, self-serving, and community) on the basis of the paradigms. The results are discussed in terms of further research and limitations.

Bayesian Analysis of Korean Alcohol Consumption Data Using a Zero-Inflated Ordered Probit Model (영 과잉 순서적 프로빗 모형을 이용한 한국인의 음주자료에 대한 베이지안 분석)

  • Oh, Man-Suk;Oh, Hyun-Tak;Park, Se-Mi
    • The Korean Journal of Applied Statistics / v.25 no.2 / pp.363-376 / 2012
  • Excessive zeros are often observed in ordinal categorical response variables. An ordinary ordered probit model is not appropriate for zero-inflated data, especially when the zero observations arise from several different sources. In this paper, we apply a two-stage zero-inflated ordered probit (ZIOP) model that incorporates the zero-inflated nature of the data, propose a Bayesian analysis of the ZIOP model, and apply the method to alcohol consumption data collected by the National Bureau of Statistics, Korea. In the first stage of the ZIOP model, a probit model divides non-drinkers into genuine non-drinkers, who do not drink because of personal beliefs or permanent health problems, and potential drinkers, who did not drink at the time of the survey but may become drinkers. In the second stage, an ordered probit model is applied to drinkers, who consist of zero-consumption potential drinkers and positive-consumption drinkers. The results show that about 30% of non-drinkers are genuine non-drinkers, so the Korean alcohol consumption data are zero-inflated. A study of the marginal effect of each explanatory variable shows that certain explanatory variables affect genuine non-drinkers and potential drinkers in opposite directions, which an ordinary ordered probit model may not detect.
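
The two-stage structure described in this abstract can be sketched as a probability calculation. This is only a minimal illustration under assumptions: the abstract does not give the paper's parametrization, so the function names, the two linear predictors, and the cutpoints below are hypothetical, and no Bayesian estimation is attempted.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def ziop_probs(participation_index, consumption_index, cutpoints):
    """Category probabilities [P(y=0), ..., P(y=J)] for a two-stage ZIOP sketch.

    participation_index: linear predictor of the first-stage probit
    consumption_index:   linear predictor of the second-stage ordered probit
    cutpoints:           increasing thresholds, J values for J+1 categories
    """
    if list(cutpoints) != sorted(cutpoints):
        raise ValueError("cutpoints must be increasing")
    p_potential = norm_cdf(participation_index)  # P(potential drinker)
    # Ordered probit probabilities among potential drinkers.
    cdf = [norm_cdf(c - consumption_index) for c in cutpoints] + [1.0]
    ordered = [cdf[0]] + [cdf[j] - cdf[j - 1] for j in range(1, len(cdf))]
    probs = [p_potential * p for p in ordered]
    probs[0] += 1.0 - p_potential  # genuine non-drinkers only ever report zero
    return probs

def genuine_share_of_zeros(participation_index, consumption_index, cutpoints):
    """Share of zero responses attributed to genuine non-drinkers."""
    p_zero = ziop_probs(participation_index, consumption_index, cutpoints)[0]
    return (1.0 - norm_cdf(participation_index)) / p_zero
```

The zero category mixes two sources, which is why a covariate can push the genuine non-drinker share and the potential drinker share in opposite directions.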

Comparing Accuracy of Imputation Methods for Categorical Incomplete Data (범주형 자료의 결측치 추정방법 성능 비교)

  • 신형원;손소영
    • The Korean Journal of Applied Statistics / v.15 no.1 / pp.33-43 / 2002
  • Various estimation methods have been developed for imputing missing categorical data, including the modal category method, logistic regression, and association rules. In this study, we propose two fusion algorithms, one based on a neural network and one on a voting scheme, that combine the results of the individual imputation methods. A Monte Carlo simulation is used to compare the performance of these methods. Five factors are used to simulate the missing data pattern: (1) input-output function, (2) data size, (3) noise of the input-output function, (4) proportion of missing data, and (5) pattern of missing data. The experimental results indicate the following. When the data size is small and the proportion of missing data is large, the modal category method, association rules, and neural network based fusion perform better than the other methods. However, when the data size is small and the correlation between the input and the missing output is strong, logistic regression and the neural network based fusion algorithm appear better than the others. When the data size is large with a low proportion of missing data, large noise, and strong correlation between the input and the missing output, the neural network based fusion algorithm turns out to be the best choice.
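
To make the modal category method and the voting idea concrete, here is a small sketch. The abstract does not describe the paper's voting scheme, so the plain plurality vote and the tie-breaking rule below are assumptions, as are the function names.

```python
from collections import Counter

def modal_category(observed):
    """Impute with the most frequent category among the observed values."""
    if not observed:
        raise ValueError("need at least one observed value")
    return Counter(observed).most_common(1)[0][0]

def vote_fusion(candidates):
    """Combine the imputations proposed by several methods by plurality vote.

    candidates: one imputed category per method, in a fixed method order.
    Ties are broken in favor of the method listed first (an assumption).
    """
    if not candidates:
        raise ValueError("need at least one candidate imputation")
    counts = Counter(candidates)
    best = max(counts.values())
    for value in candidates:
        if counts[value] == best:
            return value
```

For example, if the modal category method and logistic regression both propose "yes" while an association rule proposes "no", `vote_fusion(["yes", "yes", "no"])` returns "yes".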

Measurement of Association of Categorical Data Using The Overlapped Mosaic Plot : Dynamic Graphics Approach for $2{\times}2$ Contingency Table ($2{\times}2$ 분할표에서 동적 그래픽스로 구현된 겹쳐진 모자익 그림을 이용한 범주형 자료의 연관성 측정)

  • Yoon, Yeo-Chang;Oh, Min-Gweon
    • Journal of the Korean Data and Information Science Society / v.10 no.2 / pp.457-464 / 1999
  • In this paper, we propose an overlapped mosaic plot. The mosaic plot, proposed by Hartigan and Kleiner (1981), represents the counts in a $2{\times}2$ contingency table directly by tiles whose areas are proportional to the cell frequencies. The overlapped mosaic plot provides measures of association and supports dynamic graphics for mosaic plots. Dynamic graphics give useful information when measuring association and selecting a model, a feature that current statistical software does not provide. From the overlapped mosaic plot, we can see the deviations between the observed counts and their estimates under independence, and the dynamic graphics show how far the data are from independence.

Collapsibility Criteria using Raindrop Plots

  • 홍종선;김범준
    • Proceedings of the Korean Statistical Society Conference / 2004.11a / pp.175-178 / 2004
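  • In categorical data analysis, collapsibility has been explained in terms of odds ratios. When real $2\times2\times K$ contingency table data are applied to this theory, it is difficult to judge from the odds ratio values alone whether the table can be collapsed. Among methods for displaying odds ratios visually, the contour plot proposed by Doi, Nakamura, and Yamamoto (2001) can describe contingency table data but is limited for deciding whether collapsing is possible. In this study, we use the raindrop plot proposed by Barrowman and Myers (2003), which displays confidence intervals of odds ratios visually, and propose a method that can describe $2\times2\times K$ contingency table data and at the same time judge whether collapsing is possible.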
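
The quantities such a plot would display can be sketched as follows: an odds ratio with a confidence interval for each of the K strata. This is not the raindrop plot construction itself. The Wald interval on the log scale is an assumption chosen for simplicity, and the function names are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence limits for the 2x2 table [[a, b], [c, d]].

    Returns (odds_ratio, lower, upper); z=1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log odds ratio
    return math.exp(log_or), math.exp(log_or - z * se), math.exp(log_or + z * se)

def stratum_intervals(strata):
    """Apply odds_ratio_ci to each stratum (a, b, c, d) of a 2x2xK table."""
    return [odds_ratio_ci(*cells) for cells in strata]
```

Comparing how much the K stratum intervals overlap is one informal way to see whether a common odds ratio, and hence collapsing, looks plausible.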

A generalized logit model with mixed effects for categorical data (다가자료에 대한 혼합효과모형)

  • 최재성
    • The Korean Journal of Applied Statistics / v.15 no.1 / pp.129-137 / 2002
  • This paper suggests a generalized logit model with mixed effects for analyzing frequency data in multi-way contingency tables. The nominal response variable in this model is assumed to be polychotomous. When some factors are fixed but treated as ordinal and others are random, the paper shows how to use baseline-category logits to incorporate the mixed effects of those factors into the model. Model parameters were estimated with a numerical algorithm based on the marginal log-likelihood.
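
Baseline-category logits can be illustrated by turning them back into response probabilities. In the sketch below each linear predictor is taken to be log(pi_j / pi_baseline) for one non-baseline category. In a mixed-effects setting it would be the sum of a fixed part and a random effect, but that decomposition, the function name, and the absence of any estimation step are assumptions of this example.

```python
import math

def baseline_category_probs(linear_predictors):
    """Convert baseline-category logits into category probabilities.

    linear_predictors: eta_j = log(pi_j / pi_baseline) for each
    non-baseline category. Returns [pi_baseline, pi_1, ..., pi_J].
    """
    etas = [0.0] + list(linear_predictors)    # the baseline logit is 0 by definition
    shift = max(etas)                         # subtract the max for numerical stability
    weights = [math.exp(eta - shift) for eta in etas]
    total = sum(weights)
    return [w / total for w in weights]
```

For instance, a fixed part of 0.3 plus a random effect of 0.2 for one category would be passed as the single value 0.5 in `linear_predictors`.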