• Title/Abstract/Keywords: multinomial sampling

21 search results (processing time 0.026 seconds)

A Bayes Sequential Selection of the Least Probable Event

  • Hwang, Hyung-Tae;Kim, Woo-Chul
    • Journal of the Korean Statistical Society
    • /
    • Vol. 11, No. 1
    • /
    • pp.25-35
    • /
    • 1982
  • The problem of selecting the least probable cell in a multinomial distribution is studied in a Bayesian framework. We consider two loss components: the cost of sampling and the difference in cell probabilities between the selected and the least probable cells. A Bayes sequential selection rule is derived with respect to a Dirichlet prior, and it is compared with the best fixed sample size selection rule. The continuation sets with respect to the vague prior are tabulated for certain cases.
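
As a rough illustration of the terminal selection step only (not the sequential stopping rule or the continuation sets from the paper), the sketch below updates a Dirichlet prior with multinomial counts and picks the cell with the smallest posterior mean. Under a loss equal to the difference between the selected cell's probability and the smallest cell probability, minimizing the posterior mean is the Bayes choice once sampling has stopped. The uniform Dirichlet(1, 1, 1) prior, the counts, and the function names are assumptions made for this example.

```python
from typing import List, Sequence


def dirichlet_posterior_mean(prior: Sequence[float], counts: Sequence[int]) -> List[float]:
    """Posterior mean of each cell probability under a Dirichlet prior."""
    # Dirichlet is conjugate to the multinomial: add the counts to the prior parameters.
    posterior = [a + n for a, n in zip(prior, counts)]
    total = sum(posterior)
    return [a / total for a in posterior]


def select_least_probable(prior: Sequence[float], counts: Sequence[int]) -> int:
    """Index of the cell with the smallest posterior mean (ties go to the lowest index)."""
    means = dirichlet_posterior_mean(prior, counts)
    return min(range(len(means)), key=means.__getitem__)


# Hypothetical data: 3 cells, uniform Dirichlet(1, 1, 1) prior, 20 observations.
prior = [1.0, 1.0, 1.0]
counts = [9, 4, 7]
means = dirichlet_posterior_mean(prior, counts)
chosen = select_least_probable(prior, counts)
```

With these made-up counts the posterior parameters are (10, 5, 8), so the second cell (index 1) is selected.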


Determinants of Consumer Preference by Type of Accommodation: Two-Step Cluster Analysis

  • 박덕병;윤유식;이민수
    • 마케팅과학연구
    • /
    • Vol. 17, No. 3
    • /
    • pp.1-19
    • /
    • 2007
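  • This study classifies the accommodation facilities offered to rural tourism visitors into types and presents a method, together with its results, for identifying which visitor characteristics are associated with a preference for which facility type. To this end, rural tourism accommodation facilities were first classified using two-step cluster analysis. Two-step cluster analysis was chosen because traditional cluster analysis methods cannot be applied when the clustering variables include categorical variables. The study shows that two-step cluster analysis is very useful for classifying rural tourism accommodation facilities measured with categorical variables. A multinomial logit model was then used to identify the sociodemographic and travel characteristics of rural tourism visitors that affect the probability of preferring a particular facility type. That is, a distinguishing feature of this study is that the multinomial logit model identifies the consumer characteristics that affect the probability of preferring a particular facility type relative to the reference category (general farmhouse type).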


Bayesian Methods for Generalized Linear Models

  • Paul E. Green;Kim, Dae-Hak
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 6, No. 2
    • /
    • pp.523-532
    • /
    • 1999
  • Generalized linear models have various applications for data arising from many kinds of statistical studies. Although the response variable is generally assumed to be generated from a wide class of probability distributions, we focus on count data, which are most often analyzed using binomial models for proportions or Poisson models for rates. The methods and results presented here also apply to many other categorical data models because of the relationship between multinomial and Poisson sampling. The novelty of the approach suggested here is that all conditional distributions can be specified directly, so that straightforward Gibbs sampling is possible. The prior distribution consists of two stages: we rely on a normal nonconjugate prior at the first stage and a vague prior for the hyperparameters at the second stage. The methods are demonstrated with an illustrative example using data collected by Rosenkranz and Raftery (1994) concerning the number of hospital admissions due to back pain in Washington state.
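
The relationship between multinomial and Poisson sampling mentioned in this abstract can be checked numerically: independent Poisson counts, conditioned on their total, follow a multinomial distribution whose cell probabilities are the rates divided by their sum. The sketch below verifies this for one set of counts. The rates, the counts, and the helper names are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def multinomial_pmf(counts: Sequence[int], probs: Sequence[float]) -> float:
    """P(counts) for a multinomial with sum(counts) trials and cell probabilities probs."""
    coef = math.factorial(sum(counts))
    for k in counts:
        coef //= math.factorial(k)  # exact integer division at every step
    value = float(coef)
    for k, p in zip(counts, probs):
        value *= p ** k
    return value


def conditional_poisson_pmf(counts: Sequence[int], rates: Sequence[float]) -> float:
    """P(counts | total) for independent Poisson counts with the given rates."""
    joint = 1.0
    for k, lam in zip(counts, rates):
        joint *= poisson_pmf(k, lam)
    # The total of independent Poisson counts is Poisson with the summed rate.
    return joint / poisson_pmf(sum(counts), sum(rates))


# Hypothetical rates and counts, chosen only for illustration.
rates = [2.0, 3.0, 5.0]
counts = [1, 4, 2]
probs = [lam / sum(rates) for lam in rates]
lhs = conditional_poisson_pmf(counts, rates)
rhs = multinomial_pmf(counts, probs)
```

Here `lhs` and `rhs` agree up to floating-point error, which is the identity that lets Poisson models for rates carry over to multinomial categorical data.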


Effect of Bias on the Pearson Chi-squared Test for Two Population Homogeneity Test

  • Heo, Sunyeong
    • 통합자연과학논문집
    • /
    • Vol. 5, No. 4
    • /
    • pp.241-245
    • /
    • 2012
  • Categorical data collected under a complex sample design are not suitable for the standard Pearson multinomial-based chi-squared test because the observations are not independent and identically distributed. This study investigates how bias in the point estimator of a population proportion and in its variance estimator affects the standard Pearson chi-squared test statistic when the sample is collected under a complex sampling scheme. The effect is examined for the test of homogeneity of two populations. The standard Pearson test statistic can be partitioned into two parts: the first is a weighted sum of $\chi^2_1$ variables with the eigenvalues of the design matrix as weights, and the second is an additional term arising from the biases of the point estimator and its variance estimator. Our empirical analysis shows that even when the bias of the point estimator is small, the Pearson test statistic is greatly inflated because the variance of the point estimator is underestimated. Regarding the design-based variance estimator and its design matrix, the larger the average of the eigenvalues of the design matrix, the larger the share of the Pearson test statistic accounted for by the first component.

Segmentation and Characteristic Analysis of Urban Farmers' Behavior

  • 황정임;최윤지;장보경;이상영
    • 한국지역사회생활과학회지
    • /
    • Vol. 21, No. 4
    • /
    • pp.619-631
    • /
    • 2010
  • The purpose of this study is to segment and examine urban farmers' behavior by applying two-step cluster analysis and a multinomial logit model. The data were collected through a telephone survey using two-stage stratified random sampling in cities across the country in order to obtain representative data. Respondents were asked to describe their awareness of urban agriculture, their agricultural activity, and their sociodemographic characteristics. Of 2,000 cases, the 381 cases (19.1%) that were participants in urban agriculture were analysed in SPSS. According to the findings, 27.3% of respondents had heard the term 'urban agriculture', and 25.5% of them regarded themselves as urban farmers. Four clusters were derived from the two-step cluster analysis based on motive, place, companion, area and hours: 'Large scale hobby farming (cluster 1)', 'Weekend farm/hobby farming (cluster 2)', 'Land/Self-supporting farming (cluster 3)', and 'Small scale hobby farming (cluster 4)'. The multinomial logistic regression showed significant differences among these four segments in terms of age, city size and housing type. In other words, urbanites are likely to select different urban farming types according to their socio-demographic profiles. Therefore, urbanite profiles can serve as a basis for policies promoting the various types of urban agriculture. Based on these results, policy directions for facilitating urban agriculture are presented.

RDD Sample versus Directory-Based Sample for Telephone Surveys: The Case of 2007 Presidential Election Forecasting in Korea

  • 허명회;김영원
    • 한국조사연구학회지:조사연구
    • /
    • Vol. 9, No. 3
    • /
    • pp.55-69
    • /
    • 2008
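  • Until now, sampling frames for telephone surveys in Korea have come almost entirely from telephone directories. However, because the population coverage of telephone directories has been criticized as far too low, RDD (random digit dialing), the international standard, has been implemented as a countermeasure. The KBS-MBC telephone survey, conducted 5 to 6 days before election day to forecast the 17th presidential election of December 2007, split its sample in half: the respondent sample list was drawn by RDD for one half and from the telephone directory for the other. This case study contrasts the RDD sample and the directory sample of the KBS-MBC telephone survey and examines their similarities and differences. With directory-based samples and RDD samples expected to coexist for the next several years, the findings are expected to offer implications for comparing the two approaches.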


Obesity and Related Factors in Patients with Chronic Mental Illness Registered to Community Mental Health Welfare Centers

  • 박은숙;이은현
    • 지역사회간호학회지
    • /
    • Vol. 29, No. 1
    • /
    • pp.76-86
    • /
    • 2018
  • Purpose: The purpose of this study was to examine the relationship between obesity and its associated factors (psychiatric symptoms, duration of illness, type of medication, physical activity, dietary habits, depressive symptoms, and stress) in patients with chronic mental illness registered to community mental health welfare centers. Methods: This was a cross-sectional correlational study using convenience sampling. A total of 392 participants were recruited from community mental health welfare centers. The data were analyzed using binary and multinomial logistic regression. Results: Atypical antipsychotic medication, duration of illness, and dietary habits (overeating and drinking instant coffee) contributed significantly to body mass index (BMI) obesity. Atypical antipsychotic medication and instant coffee were significantly related to abdominal obesity. Conclusion: These results emphasize the need for tailored obesity-prevention management for community-dwelling patients with chronic mental illness, focusing in particular on the administration of atypical antipsychotic medication, duration of illness, and dietary habits.

Use of Smokeless Tobacco among Male Students of Zahedan Universities in Iran: a Cross Sectional Study

  • Honarmand, Marieh;Farhadmollashahi, Leila;Bekyghasemi, Mahmoud
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 14, No. 11
    • /
    • pp.6385-6388
    • /
    • 2013
  • Background: Smokeless tobacco consumption is one of the causes of oral cancer. The aim of this study was to determine the prevalence of smokeless tobacco consumption among male students of Zahedan universities and associated factors in 2012. Materials and Methods: In this cross-sectional study, 431 students were selected from the universities of Zahedan using multi-stage random cluster sampling. The data collection tool was a questionnaire including questions about demographic information, history of smokeless tobacco consumption, and awareness of smokeless tobacco hazards. Data were analyzed by SPSS 19 using the Chi-square test and multinomial logistic regression, with p<0.05 considered significant. Results: At the time of conducting this study, 102 students (23.7%) had already consumed smokeless tobacco and 49 students (11.4%) were current users (consuming at least once in the 30 days before the study). There was a significant relationship between history of smokeless tobacco consumption and university/college, place of living, mean GPA, and mother's education level (p<0.05). There was also a significant association between knowledge and prevalence of smokeless tobacco use (p<0.001). Conclusions: There is a relatively high prevalence of smokeless tobacco consumption among the male students of universities of Zahedan, which shows the need to emphasize the provision and implementation of prevention programs in universities.

On the Small Sample Distribution and its Consistency with the Large Sample Distribution of the Chi-Squared Test Statistic for a Two-Way Contingency Table with Fixed Margins

  • 박철용;최재성;김용곤
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 11, No. 1
    • /
    • pp.83-90
    • /
    • 2000
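  • When testing the independence of two categorical variables in a two-way contingency table, the chi-squared test statistic is commonly used. It is well known that when the sampling model is multinomial or product-multinomial, this test statistic approximately follows a chi-squared distribution under the independence assumption. When both margins are fixed, the sampling model under independence becomes the multiple hypergeometric distribution, and, as with the preceding models, a test based on the chi-squared statistic can be used. This study compares the small-sample distribution of the chi-squared statistic with fixed margins against its large-sample distribution, the chi-squared distribution. For several cases with small sample sizes, the small-sample distribution of the chi-squared statistic was computed directly. For several cases with large sample sizes, the small-sample distribution was generated with a simple Monte Carlo algorithm, and its agreement with the large-sample distribution was assessed using chi-squared probability plots and the one-sample Kolmogorov-Smirnov test.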
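
The abstract does not describe its Monte Carlo algorithm, so the following is only one plausible sketch of how such a simulation could be set up. Randomly pairing row labels with shuffled column labels yields tables with both margins fixed, distributed according to the multiple hypergeometric model under independence; the Pearson statistic is then computed for each draw. The margins, the number of draws, the seed, and all function names are assumptions made for illustration.

```python
import random
from typing import List, Sequence


def chi_squared_statistic(table: Sequence[Sequence[int]]) -> float:
    """Pearson chi-squared statistic for independence in a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            expected = r * c / n  # expected count under independence
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def random_table_fixed_margins(row_totals: Sequence[int], col_totals: Sequence[int],
                               rng: random.Random) -> List[List[int]]:
    """Draw a table with both margins fixed by randomly pairing row and column labels."""
    row_labels = [i for i, r in enumerate(row_totals) for _ in range(r)]
    col_labels = [j for j, c in enumerate(col_totals) for _ in range(c)]
    rng.shuffle(col_labels)  # a uniform random pairing gives the multiple hypergeometric model
    table = [[0] * len(col_totals) for _ in row_totals]
    for i, j in zip(row_labels, col_labels):
        table[i][j] += 1
    return table


def simulate_statistics(row_totals: Sequence[int], col_totals: Sequence[int],
                        n_draws: int, seed: int = 0) -> List[float]:
    """Simulated small-sample distribution of the statistic with fixed margins."""
    rng = random.Random(seed)
    return [chi_squared_statistic(random_table_fixed_margins(row_totals, col_totals, rng))
            for _ in range(n_draws)]


# Hypothetical 2 x 3 margins (both sum to 30); all margins must be positive.
row_totals = [12, 18]
col_totals = [10, 9, 11]
stats = simulate_statistics(row_totals, col_totals, n_draws=2000)
```

The simulated values in `stats` could then be compared with a chi-squared distribution on (2 - 1)(3 - 1) = 2 degrees of freedom, for example through a probability plot or a one-sample Kolmogorov-Smirnov test as the abstract describes.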


Pattern Analysis of Traffic Accident Data and Prediction of Victim Injury Severity Using Hybrid Model

  • 주영지;홍택은;신주현
    • 스마트미디어저널
    • /
    • Vol. 5, No. 4
    • /
    • pp.75-82
    • /
    • 2016
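  • Economic growth and changes in the road environment in Korea have expanded the domestic automobile market, but the traffic accident rate has risen as a result and casualties are at a serious level. Accordingly, the government has opened traffic accident data and is establishing and pursuing policies to address the problem. In this paper, to resolve class imbalance in traffic accident data and to predict traffic accidents by building a hybrid model, we use both the original traffic accident data and sampled data as training data. The FP-Growth algorithm, an association rule learning technique, is applied to both training sets to learn patterns associated with traffic accident injury severity. By analyzing the association patterns of the two training sets, we extract the patterns they share, then build a fused hybrid model by assigning weights to the associated attributes in a decision tree and multinomial logistic regression, and propose a method for predicting the injury severity of traffic accident victims.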