• Title/Summary/Keyword: Bayesian statistical method

The Weighted Polya Posterior Confidence Interval For the Difference Between Two Independent Proportions (독립표본에서 두 모비율의 차이에 대한 가중 POLYA 사후분포 신뢰구간)

  • Lee Seung-Chun
    • The Korean Journal of Applied Statistics / v.19 no.1 / pp.171-181 / 2006
  • The Wald confidence interval has long been regarded as the standard method for the difference of two proportions. However, the erratic behavior of its coverage probability is well documented in the literature, and various alternatives have been proposed. Among them, the Agresti-Caffo confidence interval has gained a reputation for its simplicity and fairly good coverage probability. It is known, however, that the Agresti-Caffo confidence interval is conservative. In this note, a confidence interval is developed using the weighted Polya posterior, which was employed to obtain a confidence interval for the binomial proportion in Lee (2005). The resulting confidence interval is simple and performs well on several criteria: the closeness of the average coverage probability to the nominal confidence level, the average expected length, and the mean absolute error of the coverage probability. In practice, it can be used for interval estimation of the difference of proportions for any sample size and parameter values.
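
The two baseline intervals named in the abstract above can be sketched as follows. This is a minimal illustration of the textbook Wald and Agresti-Caffo formulas, not the weighted Polya posterior interval the paper proposes; the function names and the default critical value `z` (about 95% nominal coverage) are assumptions made for this example.

```python
import math

def wald_ci(x1, n1, x2, n2, z=1.959964):
    """Wald interval for p1 - p2, given x1 of n1 and x2 of n2 successes."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

def agresti_caffo_ci(x1, n1, x2, n2, z=1.959964):
    """Agresti-Caffo interval: add one success and one failure to each
    sample, then apply the Wald formula to the adjusted counts."""
    return wald_ci(x1 + 1, n1 + 2, x2 + 1, n2 + 2, z)

# With no successes in either sample the Wald interval has zero width,
# while the Agresti-Caffo interval keeps a positive width.
print(wald_ci(0, 10, 0, 10))
print(agresti_caffo_ci(0, 10, 0, 10))
```

The degenerate case in the usage lines is one simple way to see the erratic behavior of the Wald interval that the abstract refers to.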

Context Aware Feature Selection Model for Salient Feature Detection from Mobile Video Devices (모바일 비디오기기 위에서의 중요한 객체탐색을 위한 문맥인식 특성벡터 선택 모델)

  • Lee, Jaeho;Shin, Hyunkyung
    • Journal of Internet Computing and Services / v.15 no.6 / pp.117-124 / 2014
  • Cluttered backgrounds are a major obstacle in developing salient object detection and tracking systems for natural-scene video frames captured by mobile devices. In this paper we propose a context-aware feature vector selection model that provides efficient noise filtering with machine-learning-based classifiers. Since context awareness for feature selection is achieved by searching nearest neighborhoods, which is known to be an NP-hard problem, we apply a fast approximation method and give a detailed complexity analysis. The separability enhancement gained in feature vector space by adding the context-aware feature subsets is studied rigorously using principal component analysis (PCA). The overall performance enhancement is quantified with statistical measures for several machine learning models, including MLP, SVM, naive Bayes, and CART. A summary of computational costs and performance enhancement is also presented.

Bayesian Model Selection for Linkage Analyses: Considering Collinear Predictors (연관분석을 위한 베이지안 모형 선택: 상호상관성 변수를 중심으로)

  • Suh, Young-Ju
    • The Korean Journal of Applied Statistics / v.18 no.3 / pp.533-541 / 2005
  • We identify the correct chromosome and locate the corresponding markers close to the QTL in the linkage analysis of a quantitative trait by using the stochastic search variable selection (SSVS) method. We consider several markers linked to the QTL as well as to each other, so the i.b.d. values at these loci generate collinear predictors to be evaluated with the SSVS approach. The results from considering only markers closely linked to two QTL simultaneously showed clear evidence in favor of the marker closest to the QTL over the other markers. The results of the analysis of collinear markers with SSVS showed high concordance with those obtained using traditional multiple regression. Based on this simulation study, we conclude that SSVS is quite useful for identifying linkage with multiple linked markers simultaneously for a complex quantitative trait.

MCMC Algorithm for Dirichlet Distribution over Gridded Simplex (그리드 단체 위의 디리슐레 분포에서 마르코프 연쇄 몬테 칼로 표집)

  • Sin, Bong-Kee
    • KIISE Transactions on Computing Practices / v.21 no.1 / pp.94-99 / 2015
  • With the recent machine learning paradigm of using nonparametric Bayesian statistics and statistical inference based on random sampling, the Dirichlet distribution finds many uses in a variety of graphical models. It is a multivariate generalization of the beta distribution and is defined on a continuous (K-1)-simplex. This paper presents a sampling method for a Dirichlet distribution for the problem of dividing an integer X into a sequence of K integers that sum to X. The target samples in our problem are all positive integer vectors when multiplied by a given X, so they must be sampled from the correspondingly gridded simplex. In this paper we develop a Markov chain Monte Carlo (MCMC) proposal distribution for the neighboring grid points on the simplex and then present the complete algorithm based on the Metropolis-Hastings algorithm. The proposed algorithm can be used with the Markov model, the HMM, and the semi-Markov model for accurate state-duration modeling. It can also be used with the Gamma-Dirichlet HMM to model the global-local duration distributions.
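
The idea of sampling on a gridded simplex can be illustrated with a small Metropolis sampler. This is a sketch under stated assumptions, not the paper's algorithm: the proposal used here (move one unit from a randomly chosen component to another) is a simple symmetric neighbor move chosen for this example, and the names `log_dirichlet_kernel` and `sample_gridded_dirichlet` are hypothetical. It assumes K is at least 2.

```python
import math
import random

def log_dirichlet_kernel(counts, alpha, total):
    """Unnormalized log Dirichlet density at the grid point counts / total."""
    return sum((a - 1.0) * math.log(c / total) for c, a in zip(counts, alpha))

def sample_gridded_dirichlet(alpha, total, n_steps, seed=0):
    """Metropolis sampler over positive integer vectors summing to `total`."""
    rng = random.Random(seed)
    k = len(alpha)
    if total < k:
        raise ValueError("total must be at least K so every part is positive")
    # Start from a roughly even split; every component is at least 1.
    state = [total // k] * k
    state[0] += total - sum(state)
    samples = []
    for _ in range(n_steps):
        i, j = rng.sample(range(k), 2)  # propose moving one unit from i to j
        if state[i] > 1:                # otherwise the move leaves the grid
            proposal = list(state)
            proposal[i] -= 1
            proposal[j] += 1
            log_ratio = (log_dirichlet_kernel(proposal, alpha, total)
                         - log_dirichlet_kernel(state, alpha, total))
            # Symmetric proposal, so the acceptance ratio is the density ratio.
            if rng.random() < math.exp(min(0.0, log_ratio)):
                state = proposal
        samples.append(tuple(state))
    return samples

draws = sample_gridded_dirichlet(alpha=[2.0, 3.0, 5.0], total=20, n_steps=1000)
print(draws[-1], sum(draws[-1]))
```

Every retained vector sums to `total` and has strictly positive entries, which matches the constraint described in the abstract; a rejected or out-of-grid move simply repeats the current state.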

Genetic Contribution of Indigenous Yakutian Cattle to Two Hybrid Populations, Revealed by Microsatellite Variation

  • Li, M.H.;Nogovitsina, E.;Ivanova, Z.;Erhardt, G.;Vilkki, J.;Popov, R.;Ammosov, I.;Kiselyova, T.;Kantanen, J.
    • Asian-Australasian Journal of Animal Sciences / v.18 no.5 / pp.613-619 / 2005
  • The adaptation of indigenous Yakutian cattle to the harshest subarctic conditions makes them a valuable genetic resource for cattle breeding in the Siberian area. Since early last century, crossbreeding between native Yakutian cattle and the imported Simmental and Kholmogory breeds has been widely adopted. In this study, variation at 22 polymorphic microsatellite loci in 5 populations (Yakutian, Kholmogory, Simmental, Yakutian-Kholmogory and Yakutian-Simmental cattle) was analysed to estimate the genetic contribution of Yakutian cattle to the two hybrid populations. Three statistical approaches were used: the weighted least-squares (WLS) method, which considers all allele frequencies; a recently developed implementation of a Markov chain Monte Carlo (MCMC) method called likelihood-based estimation of admixture (LEA); and a model-based Bayesian admixture analysis method (STRUCTURE). In the population-level admixture analyses, the estimate based on LEA was consistent with that obtained by the WLS method. Both methods showed that the genetic contribution of the indigenous Yakutian cattle to the Yakutian-Kholmogory population was small (9.6% by LEA and 14.2% by the WLS method). In the Yakutian-Simmental population, the genetic contribution of the indigenous Yakutian cattle was considerably higher (62.8% by LEA and 56.9% by the WLS method). Individual-level admixture analyses using STRUCTURE proved to be more informative than the multidimensional scaling analysis (MDSA) based on individual-based genetic distances. Of the 9 Yakutian-Simmental animals studied, 8 showed admixed origin, whereas of the 14 Yakutian-Kholmogory animals studied only 2 showed Yakutian ancestry (>5%). The mean posterior distributions of the individual admixture coefficient (q) varied greatly among the samples in both hybrid populations. This study revealed a minor contribution of Yakutian cattle to the Yakutian-Kholmogory hybrid population but a major genetic contribution to the Yakutian-Simmental hybrid population. The results reflect the different crossbreeding patterns used in the development of the two hybrid populations. Additionally, molecular evidence for differences among individual admixture proportions was seen in both hybrid populations, resulting from the stochastic process in crossing over generations.
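
The idea behind a least-squares admixture estimate can be sketched with a simple linear mixing model, in which each hybrid allele frequency is assumed to be m times the frequency in one parental population plus (1 - m) times the frequency in the other. The code below is an unweighted simplification written for illustration, not the WLS, LEA, or STRUCTURE procedures used in the study; the function name and the allele frequencies are made up for the example.

```python
def admixture_proportion(hybrid, parent_a, parent_b):
    """Least-squares estimate of m in hybrid ~ m * parent_a + (1 - m) * parent_b.

    Each argument is a list of allele frequencies in the same allele order.
    """
    numerator = sum((h - b) * (a - b)
                    for h, a, b in zip(hybrid, parent_a, parent_b))
    denominator = sum((a - b) ** 2 for a, b in zip(parent_a, parent_b))
    if denominator == 0:
        raise ValueError("parental populations have identical frequencies")
    return numerator / denominator

# Illustrative (invented) frequencies: a hybrid built with m = 0.25.
parent_a = [0.8, 0.1, 0.5]
parent_b = [0.2, 0.6, 0.3]
hybrid = [0.25 * a + 0.75 * b for a, b in zip(parent_a, parent_b)]
print(admixture_proportion(hybrid, parent_a, parent_b))
```

With real data the hybrid frequencies carry sampling noise, which is one reason weighted and likelihood-based estimators such as those named in the abstract are preferred in practice.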

A Study on Establishment of Reference Value of CA 72-4 (CA 72-4 참고치 설정에 관한 연구)

  • An, Jae-Seok;Kim, Ji-Na;Joe, Ye-Ji;Yoon, Sang-Hyuk;Kim, Yoon-Cheol
    • The Korean Journal of Nuclear Medicine Technology / v.25 no.2 / pp.25-28 / 2021
  • Purpose: CA 72-4 is a tumor marker assay that uses two monoclonal antibodies, CC49 and B72.3, to measure tumor-associated glycoprotein (TAG-72) in the serum. CA 72-4 is used to diagnose stomach, ovarian, and pancreatic cancers, and is known to have high specificity for stomach cancer. The purpose of this study is to re-evaluate the CA 72-4 reference value provided by the manufacturer by revalidating it, and to provide useful support for clinical diagnosis at the gastric cancer center. Materials and Methods: We selected 271 patients who visited the health care center at the National Cancer Center during November 2020. The subjects were 140 males and 131 females, aged from their 30s to their 60s. The reagent used in the study was a CA 72-4 IRMA KIT (ISOTOPES, Hungary), and the results were measured using a Dream Gamma-10 gamma counter (Shinjin medics, Korea). Results: The statistical analysis used Hoffmann's method and the Bayesian method, which are primarily used in setting reference values. For the 271 patients, the mean CA 72-4 value was 4.54 U/mL and the median was 3.30 U/mL. After excluding 24 people whose values deviated by more than 3 SD, the mean was 3.53 U/mL, the median was 3.00 U/mL, and the SD was 1.89. The reference value calculated from these results was set to 7.31 U/mL. Conclusion: The reference value provided by the manufacturer is less than 4 U/mL. It is slightly different from the value calculated in this study, 7.31 U/mL, so it seems necessary to reset the reference value according to the laboratory environment. Currently, we are receiving inquiries about the reference value from the Center for Gastric Cancer at the National Cancer Center. If additional research is carried out along with this study, it will be possible to set a more accurate reference value.
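
The figures reported in the abstract above are numerically consistent with an upper limit of mean + 2 SD after trimming (3.53 + 2 × 1.89 = 7.31). The abstract does not state this formula, so treating it as the rule used is an assumption. The sketch below applies that assumed rule; the function name and its parameters are hypothetical.

```python
import statistics

def upper_reference_limit(values, trim_sd=3.0, limit_sd=2.0):
    """Trim values farther than trim_sd SDs from the mean, then return
    (mean + limit_sd * SD of the remaining values, number of values kept)."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    kept = [v for v in values if abs(v - mean) <= trim_sd * sd]
    limit = statistics.mean(kept) + limit_sd * statistics.stdev(kept)
    return limit, len(kept)

# Check against the summary statistics reported in the abstract.
reported_mean, reported_sd = 3.53, 1.89
print(round(reported_mean + 2 * reported_sd, 2))  # 7.31
```

The function works on raw measurements, which the abstract does not provide; only the final arithmetic check uses the published numbers.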

Study on the Sea Level Pressure Prediction of Typhoon Period in South Coast of the Korean Peninsula Using the Neural Networks (신경망 모형을 이용한 태풍시기의 남해안 기압예측 연구)

  • Park, Jong-Kil;Kim, Byung-Soo;Jung, Woo-Sik;Seo, Jang-Won;Shon, Yong-Hee;Lee, Dae-Geun;Kim, Eun-Byul
    • Atmosphere / v.16 no.1 / pp.19-31 / 2006
  • The purpose of this study is to develop a statistical model to predict sea level pressure during typhoon periods on the south coast of the Korean Peninsula. Seven typhoons that struck the south coast of the Korean Peninsula were selected, and the data for analysis include the central pressure and location of each typhoon and the sea level pressure and location of 19 observing sites. The models employed are first-order regression, second-order regression, and a neural network. The dependent variable of each model is the sea level pressure at each station at 3-hr intervals. The explanatory variables are the central pressure of the typhoon, the distance between the typhoon center and the observing site, and the sea level pressure 3 hrs earlier, while an indicator variable marks whether the observation falls before or after the typhoon's passage. The data are classified into two groups: the full data obtained during the typhoon period, and the subset in which sea level pressure is below 1000 hPa. Stepwise selection is used in the regression models, while the number of nodes in the neural network is selected by Schwarz's Bayesian Criterion. The performance of each model is compared in terms of root-mean-square error. The neural network performs better than the other models, and the case using the full data produces similar or better results than the case using the subset.
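
The two scoring quantities named in the abstract above can be sketched as follows. The Schwarz criterion is written in its common least-squares form, n ln(SSE/n) + k ln(n), under a Gaussian-error assumption; the abstract does not say which form the authors used. The function names and the `candidates` structure are hypothetical, and the sketch assumes the residual sum of squares is nonzero.

```python
import math

def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def schwarz_criterion(observed, predicted, n_params):
    """Schwarz's Bayesian Criterion for a least-squares fit (lower is better)."""
    n = len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return n * math.log(sse / n) + n_params * math.log(n)

def select_by_sbc(observed, candidates):
    """Pick the candidate with the smallest criterion value.

    `candidates` maps a label (for example a node count) to a pair
    (predictions, number of fitted parameters).
    """
    return min(candidates,
               key=lambda name: schwarz_criterion(observed, *candidates[name]))
```

For a network, the parameter count grows with the number of hidden nodes, so the k ln(n) term penalizes larger networks unless they reduce the error enough to compensate.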