• Title/Summary/Keyword: Markov chain Monte Carlo

Variational Bayesian multinomial probit model with Gaussian process classification on mice protein expression level data (가우시안 과정 분류에 대한 변분 베이지안 다항 프로빗 모형: 쥐 단백질 발현 데이터에의 적용)

  • Donghyun Son;Beom Seuk Hwang
    • The Korean Journal of Applied Statistics / v.36 no.2 / pp.115-127 / 2023
  • The multinomial probit model is a popular model for multiclass classification and choice modeling. Markov chain Monte Carlo (MCMC) methods are widely used to estimate the multinomial probit model, but their computational cost is high. Variational Bayesian approximation, by contrast, is well known to be more computationally efficient than MCMC because it uses subsets of samples. In this study, we describe the multinomial probit model with Gaussian process classification and show how to apply variational Bayesian approximation to it. We also compare the results of the variational Bayesian multinomial probit model with those of naive Bayes, K-nearest neighbors, and support vector machines on the UCI mice protein expression level data.

Gas dynamics and star formation in NGC 6822

  • Park, Hye-Jin;Oh, Se-Heon;Wang, Jing;Zheng, Yun;Zhang, Hong-Xin;de Blok, W.J.G.
    • The Bulletin of The Korean Astronomical Society / v.46 no.2 / pp.70.2-71 / 2021
  • We examine gas kinematics and star formation activities of NGC 6822, a gas-rich dwarf irregular galaxy in the Local Group at a distance of ~490 kpc. We perform profile decomposition of all the line-of-sight (LOS) HI velocity profiles of the high-resolution (42.4" × 12" spatial; 1.6 km/s spectral) HI data cube of the galaxy, taken with the Australia Telescope Compact Array (ATCA). To this end, we use a novel tool based on Bayesian Markov Chain Monte Carlo (MCMC) techniques, the so-called BAYGAUD, which allows us to decompose a velocity profile into an optimal number of Gaussian components in a quantitative manner. We group all the decomposed components into bulk-narrow, bulk-broad, and non-bulk gas components, classified according to their velocity dispersions and their velocity offsets from the global kinematics. Using the surface densities and velocity dispersions of the kinematically decomposed HI gas maps together with the rotation curve of NGC 6822, we derive Toomre-Q parameters for individual regions of the galaxy, which quantify the level of local gravitational instability of the gaseous disk. We also measure the local star formation rate (SFR) of the corresponding regions in the galaxy by combining GALEX far-ultraviolet (FUV) and WISE 22 μm images. We then relate the gas and SFR surface densities in order to investigate the local Kennicutt-Schmidt (K-S) law of the gravitationally unstable regions selected from the Toomre-Q analysis. Of the three groups (bulk-narrow, bulk-broad, and non-bulk gas components), we find that the lower the Toomre-Q values of the bulk-narrow gas components, the more consistent they are with the linear extension of the K-S law derived from molecular hydrogen (H2) observations.
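The Toomre-Q parameter mentioned in this abstract can be illustrated with the standard gas-disk form Q = σκ / (πGΣ). The sketch below is not the authors' pipeline; the function name, the unit choices, and the input values are assumptions made for illustration only.

```python
import math

# Gravitational constant in pc * (km/s)^2 / Msun (standard value).
G_PC_KMS2_PER_MSUN = 4.30091e-3

def toomre_q(sigma_kms, kappa_kms_per_kpc, surface_density_msun_pc2):
    """Gas-disk Toomre Q = sigma * kappa / (pi * G * Sigma).

    sigma_kms: gas velocity dispersion in km/s
    kappa_kms_per_kpc: epicyclic frequency in km/s/kpc
    surface_density_msun_pc2: gas surface density in Msun/pc^2
    """
    kappa_per_pc = kappa_kms_per_kpc / 1000.0  # km/s/kpc -> km/s/pc
    return sigma_kms * kappa_per_pc / (
        math.pi * G_PC_KMS2_PER_MSUN * surface_density_msun_pc2
    )

# Hypothetical region: these numbers are illustrative, not measurements.
q = toomre_q(sigma_kms=8.0, kappa_kms_per_kpc=30.0, surface_density_msun_pc2=5.0)
print(f"Q = {q:.2f} ({'unstable' if q < 1.0 else 'stable'} by the Q < 1 criterion)")
```

Because Q scales linearly with the velocity dispersion, a kinematically narrower component at the same surface density yields a lower Q.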

Rare Disaster Events, Growth Volatility, and Financial Liberalization: International Evidence

  • Bongseok Choi
    • Journal of Korea Trade / v.27 no.2 / pp.96-114 / 2023
  • Purpose - This paper elucidates a nexus between the occurrence of rare disaster events and the volatility of economic growth by distinguishing the likelihood of rare events from stochastic volatility. We provide new empirical facts based on a quarterly time series. In particular, we focus on the role of financial liberalization in spreading the economic crisis in developing countries. Design/methodology - We use quarterly data on consumption expenditure (real per capita consumption) from 44 countries, including advanced and developing countries, ending in the fourth quarter of 2020. We estimate the likelihood of rare event occurrences and stochastic volatility for countries using the Bayesian Markov chain Monte Carlo (MCMC) method developed by Barro and Jin (2021). We present our estimation results for the relationship between rare disaster events, stochastic volatility, and growth volatility. Findings - We find the global common disaster event, the COVID-19 pandemic, and thirteen country-specific disaster events. Consumption falls by about 7% on average in the first quarter of a disaster and by 4% in the long run. The occurrence of rare disaster events and the volatility of gross domestic product (GDP) growth are positively correlated (4.8%), whereas the rare events and GDP growth rate are negatively correlated (-12.1%). In particular, financial liberalization has played an important role in exacerbating the adverse impact of both rare disasters and financial market instability on growth volatility. Several case studies, including the case of South Korea, provide insights into the cause of major financial crises in small open developing countries, including the Asian currency crisis of 1998. Originality/value - This paper presents new empirical facts on the relationship between the occurrence of rare disaster events (or stochastic volatility) and growth volatility. Increasing data frequency allows for greater accuracy in assessing a country's specific risk. Our findings suggest that financial market and institutional stability can be vital for buffering against rare disaster shocks. It is necessary to preemptively strengthen the foundation for financial stability in developing countries and increase the quality of the information provided to markets.

Gas kinematics and star formation in NGC 6822

  • Park, Hye-Jin;Oh, Se-Heon;Wang, Jing;Zheng, Yun;Zhang, Hong-Xin;de Blok, W.J.G.
    • The Bulletin of The Korean Astronomical Society / v.45 no.1 / pp.61.4-62 / 2020
  • We present H I gas kinematics and star formation activities of NGC 6822, a dwarf galaxy located in the Local Volume at a distance of ~490 kpc. We perform profile decomposition of the line-of-sight velocity profiles of the high-resolution (~42.4" × 12" spatial; ~1.6 km/s spectral) H I data cube taken with the Australia Telescope Compact Array (ATCA). For this, we use a new tool, the so-called BAYGAUD (BAYesian GAUssian Decompositor), which is based on Bayesian Markov Chain Monte Carlo (MCMC) techniques and allows us to decompose a line-of-sight velocity profile into an optimal number of Gaussian components in a quantitative manner. We classify the decomposed H I gas components of NGC 6822 as kinematically cold, warm, or hot according to their velocity dispersion: 1) cold: < 4 km/s, 2) warm: 4-8 km/s, 3) hot: > 8 km/s. We then derive the Toomre-Q parameters of NGC 6822 using the kinematically decomposed H I gas maps. We also correlate their gas surface densities with the surface star formation rates derived from both GALEX far-ultraviolet and WISE 22 micron data to examine the impact of gas turbulence caused by stellar feedback on the Kennicutt-Schmidt (K-S) law. The kinematically cold component is likely to better follow the linear extension of the K-S law for molecular hydrogen (H2) in the low gas surface density regime where H I is not saturated.
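The cold/warm/hot classification by velocity dispersion stated in this abstract can be written as a small helper. This is only a sketch: the function name is hypothetical, the sample dispersions are made up, and placing exactly 4 km/s and 8 km/s in the "warm" bin is an assumption, since the abstract does not say how boundary values are handled.

```python
def classify_dispersion(sigma_kms):
    """Label a decomposed Gaussian component by its velocity dispersion (km/s)."""
    if sigma_kms < 4.0:
        return "cold"
    if sigma_kms <= 8.0:  # assumption: both boundary values count as warm
        return "warm"
    return "hot"

# Illustrative dispersions, not values from the paper.
for sigma in (2.5, 6.3, 11.2):
    print(sigma, classify_dispersion(sigma))
```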

Realization of water distribution system digital twin model using parameter calibration model (상수도관망 디지털트윈 구현을 위한 해석 프로그램 매개변수 검보정 모형 개발)

  • Lee, Jaeyeon;Park, Jaehong;Lee, Seungyub
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.182-182 / 2022
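  • Fourth Industrial Revolution technologies are being actively introduced into the water distribution network field, laying the technical foundation for smart water systems. Among them, a digital twin is defined as a technology that creates a twin of a real-world object on a computer and predicts outcomes in advance by simulating situations that could occur in reality. In other words, the core technology of a digital twin is the coupling of visualization with a simulation model, which not only displays the real-time situation but also predicts the state of the object by estimating future changes in the simulation model's inputs. For water distribution networks as well, when building a digital twin model it is important to couple it with a sophisticated simulation model so that observed data are displayed and data at unobserved locations are estimated and displayed. This study introduces a parameter calibration model for a water distribution network analysis program, which is arguably the most essential element in building a digital twin model. EPANET2.2, a representative water distribution network analysis program, mainly requires demands and pipe roughness coefficients as inputs; in this study, demands are assumed to be known and only the pipe roughness coefficients are calibrated using Markov-Chain Monte Carlo (MCMC). The model (1) enables real-time estimation of roughness coefficients, (2) simultaneously enables leak detection, and (3) defines the functional deterioration of pipes, establishing a basis for displaying pipe deterioration when a digital twin model is implemented in the future. Real-time roughness coefficient estimation is carried out in connection with a database, and the distribution of each pipe's roughness coefficient obtained from the MCMC model is used to judge whether fluctuations remain within the normal range. When a fluctuation outside the normal range occurs, a potential leak is assumed to exist, and this is judged using the Kolmogorov-Smirnov (KS) test. Functional deterioration is related to the conveyance capacity of a pipe; the conveyance capacity is calculated from the estimated roughness coefficient and the result is displayed. The model proposed in this study is expected to serve as a core enabling technology for implementing digital twins of water distribution networks.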
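The KS-test step described in this abstract can be sketched as a comparison between a baseline set of roughness draws and a newer set. The abstract does not say whether a one-sample or two-sample test is used, so the two-sample form here is an assumption, as are the function names, the synthetic Hazen-Williams-like values, and the 5% asymptotic critical value.

```python
import bisect
import math
import random

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for x in a_sorted + b_sorted:
        fa = bisect.bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect.bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d

def ks_critical(n, m, c_alpha=1.358):
    """Asymptotic critical value; c_alpha = 1.358 corresponds to alpha = 0.05."""
    return c_alpha * math.sqrt((n + m) / (n * m))

# Synthetic stand-ins for posterior roughness draws of one pipe (not real data).
rng = random.Random(1)
baseline = [rng.gauss(110.0, 3.0) for _ in range(200)]
current = [rng.gauss(104.0, 3.0) for _ in range(200)]

d = ks_statistic(baseline, current)
outside_normal_range = d > ks_critical(len(baseline), len(current))
print(f"D = {d:.3f}, outside normal range: {outside_normal_range}")
```

In this sketch, a flagged pipe would be treated as a candidate for a potential leak, mirroring the decision rule the abstract describes.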

A Search for Exoplanets around Northern Circumpolar Stars. IX. A Multi-Period Analysis of the M Giant HD 135438

  • Byeong-Cheol Lee;Jae-Rim Koo;Yeon-Ho Choi;Tae-Yang Bang;Beomdu Lim;Myeong-Gu Park;Gwanghui Jeong
    • Journal of The Korean Astronomical Society / v.56 no.2 / pp.277-286 / 2023
  • It is difficult to distinguish the pure signal produced by a planetary companion orbiting a giant star from other possible sources, such as stellar spots, pulsations, or other stellar activity. Since 2003, we have obtained radial velocity (RV) data for evolved stars using the high-resolution, fiber-fed Bohyunsan Observatory Echelle Spectrograph (BOES) at the Bohyunsan Optical Astronomy Observatory (BOAO). Here, we report the results of RV variations in the binary star HD 135438. We found two significant periods: 494.98 d with an eccentricity of 0.23 and 8494.1 d with an eccentricity of 0.83. Considering orbital stability, it is impossible to have two companions in such close orbits with high eccentricity. To determine the nature of the RV variability, we analyzed indicators of stellar spots and stellar chromospheric activity and found no signals related to the significant period of 494.98 d. However, we calculated the upper limit of the rotation period from the rotational velocity and found it to be 478-536 d. One possible interpretation is that this period is closely related to rotational modulation at an orbital inclination of 67-90 degrees. The other signal, corresponding to the period of 8494.1 d, is probably associated with a stellar companion orbiting the giant star. A Markov Chain Monte Carlo (MCMC) simulation considering a single companion indicates that the HD 135438 system hosts a stellar companion of 0.57 ± 0.017 M☉ with an orbital period of 8498 d.

A Study on derivation of drought severity-duration-frequency curve through a non-stationary frequency analysis (비정상성 가뭄빈도 해석 기법에 따른 가뭄 심도-지속기간-재현기간 곡선 유도에 관한 연구)

  • Jeong, Minsu;Park, Seo-Yeon;Jang, Ho-Won;Lee, Joo-Heon
    • Journal of Korea Water Resources Association / v.53 no.2 / pp.107-119 / 2020
  • This study analyzed past drought characteristics based on observed rainfall data and produced a long-term outlook for future extreme droughts using Representative Concentration Pathway 8.5 (RCP 8.5) climate change scenarios. The Standardized Precipitation Index (SPI), a meteorological drought index, was applied with durations of 1, 3, 6, 9 and 12 months for quantitative drought analysis. A single long-term time series was constructed by combining daily rainfall observations with the RCP scenario data, and this series was used as the SPI input for each duration. To analyze meteorological drought over the relatively long observation period since 1954 in Korea, 12 rainfall stations were selected, and 10 general circulation models (GCMs) were applied at the same locations. Trend analysis and clustering were performed to analyze drought characteristics under climate change. For the non-stationary frequency analysis, we adopted DEMC, a Bayesian sampling technique that combines differential evolution (DE) and Markov chain Monte Carlo (MCMC). The non-stationary drought frequency analysis was used to derive Severity-Duration-Frequency (SDF) curves for the 12 locations. A quantitative outlook for future droughts was produced by deriving SDF curves from long-term hydrologic data under the non-stationarity assumption and by quantitatively identifying potential drought risks. The cluster analysis performed to identify spatial characteristics indicated a high risk of future drought in Jeonju, Gwangju, Yeosun, Mokpo, and Chupyeongryeong, except Jeju, corresponding to Zones 1-2, 2, and 3-2. These results could be used efficiently in future drought management policies.

A Phylogenetic Analysis of Otters (Lutra lutra) Inhabiting in the Gyeongnam Area Using D-Loop Sequence of mtDNA and Microsatellite Markers (경남지역 수달(Lutra lutra)의 mitochondrial DNA D-loop지역과 microsatellite marker를 이용한 계통유전학적 유연관계 분석)

  • Park, Moon-Sung;Lim, Hyun-Tae;Oh, Ki-Cheol;Moon, Young-Rok;Kim, Jong-Gap;Jeon, Jin-Tae
    • Journal of Life Science / v.21 no.3 / pp.385-392 / 2011
  • The otter (Lutra lutra) in Korea is classified as a first grade endangered species and is managed under state control. We performed a phylogenetic analysis of the otter that inhabits the Changnyeong, Jinju, and Geoje areas in Gyeongsangnamdo, Korea using mtDNA and microsatellite (MS) markers. As a result of the analysis using the 676-bp D-loop sequence of mtDNA, six haplotypes were estimated from five single nucleotide polymorphisms. The genetic distance between the Jinju and Geoje areas was greater than distances within the areas, and the distance between Jinju and Geoje was especially clear. From the phylogenetic tree estimated using the Bayesian Markov chain Monte Carlo analysis by the MrBayes program, two subgroups, one containing samples from Jinju and the other containing samples from the Changnyeong and Geoje areas, were clearly identified. The result of a parsimonious median-joining network analysis also showed two clear subgroups, supporting the result of the phylogenetic analysis. On the other hand, in the consensus tree estimated using the genetic distances estimated from the genotypes of 13 MS markers, there were two clear subgroups, one containing samples from the Jinju, Geoje and Changnyeong areas and the other containing samples from only the Jinju area. The samples were not identically classified into each subgroup defined by mtDNA and MS markers. It could be inferred that the differential classification of samples by the two different marker systems was because of the different characteristics of the marker systems used, that is, the mtDNA was for detecting maternal lineage and the MS markers were for estimating autosomal genetic distances. Nonetheless, the results from the two marker systems showed that there has been a progressive genetic fixation according to the habitats of the otters. Further analyses using not only newly developed MS markers that will possess more analytical power but also the whole mtDNA are needed. Expansion of the phylogenetic analysis using otter samples collected from the major habitats in Korea should be helpful in scientifically and efficiently maintaining and preserving them.

Survival Analysis for White Non-Hispanic Female Breast Cancer Patients

  • Khan, Hafiz Mohammad Rafiqullah;Saxena, Anshul;Gabbidon, Kemesha;Stewart, Tiffanie Shauna-Jeanne;Bhatt, Chintan
    • Asian Pacific Journal of Cancer Prevention / v.15 no.9 / pp.4049-4054 / 2014
  • Background: Race and ethnicity are significant factors in predicting survival time of breast cancer patients. In this study, we applied advanced statistical methods to predict the survival of White non-Hispanic female breast cancer patients, who were diagnosed between the years 1973 and 2009 in the United States (U.S.). Materials and Methods: Demographic data from the Surveillance Epidemiology and End Results (SEER) database were used for the purpose of this study. Nine states were randomly selected from 12 U.S. cancer registries. A stratified random sampling method was used to select 2,000 female breast cancer patients from these nine states. We compared four types of advanced statistical probability models to identify the best-fit model for the White non-Hispanic female breast cancer survival data. Three model building criteria were used to measure and compare goodness of fit of the models. These include the Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC), and Deviance Information Criterion (DIC). In addition, we used a novel Bayesian method and the Markov Chain Monte Carlo technique to determine the posterior density function of the parameters. After evaluating the model parameters, we selected the model having the lowest DIC value. Using this Bayesian method, we derived the predictive survival density for future survival time and its related inferences. Results: The analytical sample of White non-Hispanic women included 2,000 breast cancer cases from the SEER database (1973-2009). The majority of cases were married (55.2%), the mean age of diagnosis was 63.61 years (SD = 14.24) and the mean survival time was 84 months (SD = 35.01). After comparing the four statistical models, results suggested that the exponentiated Weibull model (DIC = 19818.220) was a better fit for White non-Hispanic females' breast cancer survival data. This model predicted the survival times (in months) for White non-Hispanic women after implementation of precise estimates of the model parameters. Conclusions: By using modern model building criteria, we determined that the data best fit the exponentiated Weibull model. We incorporated precise estimates of the parameter into the predictive model and evaluated the survival inference for the White non-Hispanic female population. This method of analysis will assist researchers in making scientific and clinical conclusions when assessing survival time of breast cancer patients.
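Two of the criteria named in this abstract have simple closed forms: AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L. The sketch below shows how they might be compared across candidate models. The model names and log-likelihood values are hypothetical placeholders, not results from the study, and DIC is left out because it requires posterior draws.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fits: name -> (maximized log-likelihood, number of parameters).
fits = {"model_a": (-9950.0, 2), "model_b": (-9905.0, 3)}
n_obs = 2000  # sample size reported in the abstract

scores = {name: (aic(ll, k), bic(ll, k, n_obs)) for name, (ll, k) in fits.items()}
best_by_aic = min(scores, key=lambda name: scores[name][0])
print(scores, best_by_aic)
```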

At-site Low Flow Frequency Analysis Using Bayesian MCMC: II. Application and Comparative Studies (Bayesian MCMC를 이용한 저수량 점 빈도분석: II. 적용과 비교분석)

  • Kim, Sang-Ug;Lee, Kil-Seong
    • Journal of Korea Water Resources Association / v.41 no.1 / pp.49-63 / 2008
  • The Bayesian Markov chain Monte Carlo (Bayesian MCMC) method and the maximum likelihood estimation (MLE) method using a quadratic approximation are applied to at-site low flow frequency analysis at four stage stations (Nakdong, Waegwan, Goryeonggyo, and Jindong). Using the results of the two estimation methods, frequency curves including uncertainty are plotted. Eight case studies using synthetic flow data with a sample size of 100, generated from the 2-parameter Weibull distribution, are performed to compare the results of the MLE and the Bayesian MCMC. The Bayesian MCMC and the MLE are applied to 36 years of gauged data to validate the efficiency of the developed scheme. These examples illustrate the advantages of the Bayesian MCMC and the limitations of the MLE based on a quadratic approximation. From the point of view of uncertainty analysis, the Bayesian MCMC is more effective than the MLE using a quadratic approximation when the sample size is small. In particular, the Bayesian MCMC is a more attractive method than the MLE based on a quadratic approximation because the sample size of low flow at the site of interest is usually too small for low flow frequency analysis.
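The synthetic experiment described here (100 values from a 2-parameter Weibull distribution, fitted by Bayesian MCMC) can be approximated with a random-walk Metropolis sampler. This is a minimal sketch, not the authors' scheme: the flat priors on positive parameters, the step sizes, the seeds, and the true values (shape 2, scale 10) are all assumptions.

```python
import math
import random

def weibull_loglik(data, shape, scale):
    """Log-likelihood of the 2-parameter Weibull; -inf outside the support."""
    if shape <= 0.0 or scale <= 0.0:
        return float("-inf")
    n = len(data)
    return (n * math.log(shape) - n * shape * math.log(scale)
            + (shape - 1.0) * sum(math.log(x) for x in data)
            - sum((x / scale) ** shape for x in data))

def metropolis_weibull(data, n_iter=6000, burn_in=1000, steps=(0.15, 0.5), seed=0):
    """Random-walk Metropolis with flat priors on shape > 0 and scale > 0."""
    rng = random.Random(seed)
    shape, scale = 1.0, sum(data) / len(data)
    current = weibull_loglik(data, shape, scale)
    draws = []
    for i in range(n_iter):
        cand_shape = shape + rng.gauss(0.0, steps[0])
        cand_scale = scale + rng.gauss(0.0, steps[1])
        cand = weibull_loglik(data, cand_shape, cand_scale)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, cand - current)):
            shape, scale, current = cand_shape, cand_scale, cand
        if i >= burn_in:
            draws.append((shape, scale))
    return draws

# Synthetic sample of size 100; weibullvariate takes (scale, shape).
data_rng = random.Random(42)
data = [data_rng.weibullvariate(10.0, 2.0) for _ in range(100)]

draws = metropolis_weibull(data)
shapes = sorted(s for s, _ in draws)
mean_shape = sum(shapes) / len(shapes)
mean_scale = sum(c for _, c in draws) / len(draws)
interval = (shapes[int(0.025 * len(shapes))], shapes[int(0.975 * len(shapes))])
print(mean_shape, mean_scale, interval)
```

The spread of the retained draws gives the parameter uncertainty directly, which is the property the abstract contrasts with the quadratic approximation used for the MLE.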