• Title/Summary/Keyword: bayesian model

Search results: 1,323 items (processing time: 0.032 s)

Design and Implementation of Travel Mode Choice Model Using the Bayesian Networks of Data Mining

  • 김현기;김강수;이상민
    • Journal of Korean Society of Transportation
    • /
    • Vol. 22, No. 2
    • /
    • pp.77-86
    • /
    • 2004
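  • Data mining is the process of efficiently searching large data sets for relationships, patterns, and rules, modeling them, and thereby extracting and transforming them into useful information. The Bayesian network, alongside neural networks, genetic algorithms, and fuzzy theory, is one of the major data mining techniques. It applies Bayesian statistical theory to encode the probabilistic relationships among variables, which makes it possible to identify causal relationships between explanatory and dependent variables. This study builds a travel mode choice model for the Seoul metropolitan area using the Bayesian network, a data mining technique that had not previously been applied to this problem. Reflecting the socioeconomic and transportation system characteristics recorded in the 2002 Seoul Metropolitan Area Household Travel Survey, the study designs and builds a Bayesian network mode choice model, analyzes the correlations and causal relationships among the variables, and predicts the rate of change (probability) in mode choice when the composition of the explanatory variables gender and age changes. The study suggests a way to overcome the limitation of existing mode choice models, which ignore the complex correlations among explanatory variables that exist in reality and assume a simple linear relationship between the explanatory variables and mode choice. It also overcomes the difficulty of building a mode choice model caused by the lack of information on unchosen modes, and it develops a methodology for simulating in real time how mode choice changes under various transportation policies.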
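
The prediction step described in this abstract (how mode shares shift when the gender and age composition changes) can be sketched as a marginalization over a small Bayesian network. This is a minimal illustration only: the network structure (gender and age as independent parents of mode), the category names, and every probability below are hypothetical and are not taken from the paper or the 2002 survey.

```python
from itertools import product

# Hypothetical conditional probability table P(mode | gender, age_group).
# All numbers are illustrative assumptions, not survey estimates.
CPT = {
    ("male", "young"):   {"car": 0.30, "bus": 0.40, "subway": 0.30},
    ("male", "old"):     {"car": 0.60, "bus": 0.20, "subway": 0.20},
    ("female", "young"): {"car": 0.20, "bus": 0.45, "subway": 0.35},
    ("female", "old"):   {"car": 0.40, "bus": 0.35, "subway": 0.25},
}


def mode_shares(p_gender, p_age):
    """Marginalize the parents out: P(mode) = sum_g sum_a P(g) P(a) P(mode | g, a).

    Assumes gender and age are independent root nodes (a simplification).
    """
    shares = {"car": 0.0, "bus": 0.0, "subway": 0.0}
    for g, a in product(p_gender, p_age):
        weight = p_gender[g] * p_age[a]
        for mode, p in CPT[(g, a)].items():
            shares[mode] += weight * p
    return shares


# Baseline population versus a hypothetical older population.
baseline = mode_shares({"male": 0.5, "female": 0.5}, {"young": 0.6, "old": 0.4})
aged = mode_shares({"male": 0.5, "female": 0.5}, {"young": 0.4, "old": 0.6})

for mode in baseline:
    print(f"{mode}: {baseline[mode]:.3f} -> {aged[mode]:.3f}")
```

In a real application the table would be learned from the survey data, and the parent nodes would generally not be independent.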

A Development of Downscaling Model for Sub-daily Rainfall Based on Bayesian Copula Model

  • 김진영;소병진;권덕순;권현한
    • Korea Water Resources Association: Conference Proceedings
    • /
    • Korea Water Resources Association 2016 Conference
    • /
    • pp.229-229
    • /
    • 2016
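  • Climate change scenario data currently available in Korea and abroad are provided at a daily time step. However, one of the key inputs for water resources design and planning is hourly rainfall data. Because hourly data are an important input for rainfall-runoff analysis, dam design, and risk analysis, a reliable downscaling technique is needed to assess the impact of climate change scenarios. Daily-scale downscaling and spatial downscaling have been studied extensively in Korea and abroad, whereas research on hourly-scale downscaling remains relatively limited. That is, for daily downscaling a variety of techniques such as weather generators and weather typing exist and have been applied in many studies, while for hourly downscaling most existing cases use Poisson-based techniques. For this reason, this study introduces a Bayesian approach to develop a model that can generate reliable hourly rainfall for assessing the impact of climate change scenarios, and the results computed by decade are used in a frequency analysis to present future probable rainfall. Existing approaches estimate the marginal distribution parameters and the copula parameters separately with different techniques and cannot estimate the uncertainty arising in each model. In contrast, the proposed Bayesian Copula model can present the posterior distributions of the parameters quantitatively, and the study confirmed the advantage that the uncertainty of the estimated probable rainfall can also be quantified.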

Bayesian Analysis for Multiple Capture-Recapture Models using Reference Priors

  • Younshik;Pongsu
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 7, No. 1
    • /
    • pp.165-178
    • /
    • 2000
  • Bayesian methods are considered for multiple capture-recapture data. Reference priors are developed for such models, and a sampling-based approach using the Gibbs sampler is used for inference from the posterior distributions. Furthermore, approximate Bayes factors are obtained for model selection between trap and nontrap response models. Finally, the methodology is applied to a capture-recapture model with generated data and real data.

Bayesian Test for the Intraclass Correlation Coefficient in the One-Way Random Effect Model

  • Kang, Sang-Gil;Lee, Hee-Choon
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 15, No. 3
    • /
    • pp.645-654
    • /
    • 2004
  • In this paper, we develop a Bayesian test procedure for the intraclass correlation coefficient in the unbalanced one-way random effect model based on reference priors. The objective is to compare two nested models, the independence model and the intraclass model, using the fractional Bayes factor. The model comparison problem in this case amounts to testing the hypotheses $H_1: \rho = 0$ versus $H_2: \rho \neq 0$. Some real data examples are provided.

Statistical Applications for the Prediction of White Hispanic Breast Cancer Survival

  • Khan, Hafiz Mohammad Rafiqullah;Saxena, Anshul;Gabbidon, Kemesha;Ross, Elizabeth;Shrestha, Alice
    • Asian Pacific Journal of Cancer Prevention
    • /
    • Vol. 15, No. 14
    • /
    • pp.5571-5575
    • /
    • 2014
  • Background: The ability to predict the survival time of breast cancer patients is important because of the potentially high morbidity and mortality associated with the disease. To develop a predictive inference for determining the survival of breast cancer patients, we applied a novel Bayesian method. In this paper, we propose the development of a data-based statistical probability model and the application of the Bayesian method to predict future survival times for White Hispanic female breast cancer patients diagnosed in the US during 1973-2009. Materials and Methods: A stratified random sample of White Hispanic female patient survival data was selected from the Surveillance Epidemiology and End Results (SEER) database to derive statistical probability models. Four models were considered to identify the best-fit model. We used three standard model-building criteria, the Akaike Information Criterion (AIC), the Bayesian Information Criterion (BIC), and the Deviance Information Criterion (DIC), to measure goodness of fit. Furthermore, the Bayesian method was used to derive predictive inferences for future survival times. Results: The highest number of White Hispanic female breast cancer patients in this sample was from New Mexico and the lowest from Hawaii. The mean (SD) age at diagnosis (years) was 58.2 (14.2). The mean (SD) survival time (months) for White Hispanic females was 72.7 (32.2). We found that the exponentiated Weibull model fit the survival times best compared to other widely known statistical probability models. The predictive inference for future survival times is presented using the Bayesian method. Conclusions: The findings are significant for treatment planning and health-care cost allocation. They should also contribute to further research on breast cancer survival issues.
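
Two of the goodness-of-fit criteria named in this abstract have simple closed forms: AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L, where k is the number of fitted parameters, n the sample size, and ln L the maximized log-likelihood. The sketch below applies them to an exponential model fitted by maximum likelihood. The survival times are made-up values for illustration, and the exponential model is a stand-in rather than one of the paper's candidate models. DIC is omitted because it requires posterior draws.

```python
import math


def aic(log_likelihood, k):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_likelihood


def bic(log_likelihood, k, n):
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_likelihood


def exponential_max_loglik(times):
    """Maximized log-likelihood of an exponential model (MLE rate = 1 / mean)."""
    n = len(times)
    rate = n / sum(times)
    return n * math.log(rate) - rate * sum(times)


# Hypothetical survival times in months (illustrative only).
times = [12.0, 35.0, 48.0, 60.0, 72.0, 90.0, 110.0]
loglik = exponential_max_loglik(times)
print("AIC:", aic(loglik, k=1))
print("BIC:", bic(loglik, k=1, n=len(times)))
```

When comparing several candidate models on the same data, the model with the smallest criterion value is preferred.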

Complex Segregation Analysis of Categorical Traits in Farm Animals: Comparison of Linear and Threshold Models

  • Kadarmideen, Haja N.;Ilahi, H.
    • Asian-Australasian Journal of Animal Sciences
    • /
    • Vol. 18, No. 8
    • /
    • pp.1088-1097
    • /
    • 2005
  • The main objectives of this study were to investigate the accuracy, bias, and power of linear and threshold model segregation analysis methods for detecting major genes in categorical traits in farm animals. The Maximum Likelihood Linear Model (MLLM), Bayesian Linear Model (BALM), and Bayesian Threshold Model (BATM) were applied to simulated data on normal, categorical, and binary scales as well as to disease data in pigs. Simulated data on the underlying normally distributed liability (NDL) were used to create categorical and binary data. The MLLM method was applied to data on all scales (normal, categorical, and binary), and the BATM method was developed and applied only to binary data. The MLLM analyses underestimated parameters for binary as well as categorical traits compared to normal traits, with the bias being very severe for binary traits. The accuracy of major gene and polygene parameter estimates was also very low for binary data compared with categorical data; the latter gave results similar to normal data. When disease incidence (on the binary scale) is close to 50%, segregation analysis has higher accuracy and less bias than for diseases with rare incidence. NDL data were always better than categorical data. Under the MLLM method, the test statistics for categorical and binary data were consistently very high (while the opposite is expected due to loss of information in categorical data), indicating high false discovery rates of major genes if linear models are applied to categorical traits. With Bayesian segregation analysis, the 95% highest probability density regions of the major gene variances were checked for whether they included the value of zero (boundary parameter); because of this difference between the likelihood and Bayesian approaches, the Bayesian methods are likely to be more reliable for categorical data. The BATM segregation analysis of binary data also showed a significant advantage over MLLM in terms of higher accuracy. Based on the results, threshold models are recommended when the trait distributions are discontinuous. Further, segregation analysis could be used in an initial scan of the data for evidence of major genes before embarking on molecular genome mapping.
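
The data-generation step in this abstract, creating binary and categorical records from an underlying normally distributed liability, can be sketched as follows. This is an illustration under stated assumptions: the liability is standard normal with no genetic effects, and the incidence, cut-points, sample size, and function names are hypothetical rather than the study's simulation design.

```python
import random
from statistics import NormalDist


def threshold_for_incidence(incidence):
    """Liability cut-off t such that P(liability > t) = incidence under N(0, 1)."""
    return NormalDist().inv_cdf(1.0 - incidence)


def to_binary(liabilities, incidence):
    """Score 1 (affected) when the liability exceeds the incidence threshold."""
    t = threshold_for_incidence(incidence)
    return [1 if x > t else 0 for x in liabilities]


def to_categories(liabilities, cutpoints):
    """Ordered category = number of cut-points the liability exceeds."""
    return [sum(x > c for c in cutpoints) for x in liabilities]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
liability = [rng.gauss(0.0, 1.0) for _ in range(20000)]

binary = to_binary(liability, incidence=0.10)             # rare disease
categories = to_categories(liability, cutpoints=[-0.5, 0.5])  # three classes

observed_incidence = sum(binary) / len(binary)
print(f"observed incidence: {observed_incidence:.3f}")
```

The binary scale keeps only which side of one threshold each animal falls on, which is the loss of information the abstract refers to.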

Bayesian Hypothesis Testing for Intraclass Correlation Coefficient

  • Lee, Seung-A;Kim, Dal-Ho
    • Communications for Statistical Applications and Methods
    • /
    • Vol. 13, No. 3
    • /
    • pp.551-566
    • /
    • 2006
  • In this paper, we consider Bayesian model selection for the intraclass correlation coefficient in familial data. In particular, we compare two nested models, the independence model and the intraclass model, using the reference prior. The testing criteria are the Bayesian Reference Criterion of Bernardo (1999) and the Intrinsic Bayes Factor of Berger and Pericchi (1996). We provide numerical examples using simulated data sets for illustration.

A Discrete Time Approximation Method using Bayesian Inference of Parameters of Weibull Distribution and Acceleration Parameters with Time-Varying Stresses

  • 정인승
    • Korean Society of Mechanical Engineers: Conference Proceedings
    • /
    • Korean Society of Mechanical Engineers 2008 Fall Conference A
    • /
    • pp.1331-1336
    • /
    • 2008
  • This paper proposes a method that uses Bayesian inference to estimate the parameters of the Weibull distribution and the acceleration parameters when the stresses are time-dependent functions. A Bayesian model based on a discrete-time approximation is formulated to infer the parameters of interest from the failure data of virtual tests, and a statistical analysis is carried out to determine the most probable mean values of the parameters that explain the failure data.

A Bayesian Approach to PM Model with Random Maintenance Quality

  • Jung, Ki-Mun
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 18, No. 3
    • /
    • pp.689-696
    • /
    • 2007
  • This paper considers a Bayesian approach to determining an optimal preventive maintenance (PM) policy with random maintenance quality. That is, we assume that the quality of a PM action is a random variable following a probability distribution. When the failure time follows a Weibull distribution with uncertain parameters, a Bayesian approach is established to formally express and update the uncertain parameters for determining an optimal PM policy. Finally, numerical examples are presented for illustration.
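
The idea of expressing and updating uncertain Weibull parameters can be illustrated with a grid approximation of the posterior. This is only a sketch, not the paper's procedure: the flat prior, the parameter grids, and the failure times are all hypothetical, and the PM cost optimization itself is not shown.

```python
import math


def weibull_logpdf(t, shape, scale):
    """Log-density of a Weibull(shape, scale) failure time t > 0."""
    z = t / scale
    return math.log(shape / scale) + (shape - 1.0) * math.log(z) - z ** shape


def grid_posterior(times, shapes, scales):
    """Posterior over a (shape, scale) grid under a flat prior (an assumption)."""
    log_post = {
        (b, e): sum(weibull_logpdf(t, b, e) for t in times)
        for b in shapes
        for e in scales
    }
    peak = max(log_post.values())  # subtract the max for numerical stability
    weights = {k: math.exp(v - peak) for k, v in log_post.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Hypothetical failure times and parameter grids (illustrative only).
failure_times = [90.0, 110.0, 100.0, 120.0, 80.0]
posterior = grid_posterior(failure_times,
                           shapes=[1.0, 2.0, 3.0],
                           scales=[50.0, 100.0, 150.0])

best = max(posterior, key=posterior.get)
print("most probable (shape, scale):", best)
```

A posterior of this kind could then be averaged over when evaluating the expected cost of a candidate PM interval.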

Multivariable Bayesian curve-fitting under functional measurement error model

  • Hwang, Jinseub;Kim, Dal Ho
    • Journal of the Korean Data and Information Science Society
    • /
    • Vol. 27, No. 6
    • /
    • pp.1645-1651
    • /
    • 2016
  • Much data, particularly in the medical field, contains variables measured with error, such as blood pressure and body mass index. Meanwhile, smoothing methods are now often used to solve complex scientific problems. In this paper, we study Bayesian curve fitting under a functional measurement error model. Specifically, we extend our previous model by incorporating covariates free of measurement error, and we use penalized splines to capture non-linear patterns. We employ a hierarchical Bayesian framework based on Markov chain Monte Carlo (MCMC) methodology to fit the model and estimate the parameters. As an application, we use data from the fifth wave (2012) of the Korea National Health and Nutrition Examination Survey, a national population-based survey. Potential scale reduction factors are used to examine the convergence of the MCMC sampling, and a model selection criterion is used to check performance.
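
The convergence check mentioned in this abstract, the potential scale reduction factor (often written R-hat), compares between-chain and within-chain variance across parallel MCMC chains. Below is a minimal sketch of the basic Gelman-Rubin form for one scalar parameter. The chains are synthetic draws generated for illustration, not output from the paper's model, and refinements such as split chains are omitted.

```python
import math
import random
from statistics import mean, variance


def psrf(chains):
    """Basic potential scale reduction factor for equal-length chains.

    W is the mean within-chain variance, B/n the variance of the chain means;
    values close to 1 suggest the chains have mixed.
    """
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    within = mean([variance(c) for c in chains])      # W
    between_over_n = variance(chain_means)            # B / n
    var_hat = (n - 1) / n * within + between_over_n   # pooled variance estimate
    return math.sqrt(var_hat / within)


rng = random.Random(1)  # fixed seed so the sketch is reproducible

# Four synthetic chains sampling the same distribution: R-hat near 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]

# Four synthetic chains stuck around different values: R-hat well above 1.
stuck = [[rng.gauss(5.0 * j, 1.0) for _ in range(1000)] for j in range(4)]

print(f"mixed chains: {psrf(mixed):.3f}")
print(f"stuck chains: {psrf(stuck):.3f}")
```

A common rule of thumb treats values below roughly 1.1 as acceptable, although stricter cut-offs are also used.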