• Title/Summary/Keyword: Bayesian information

Evaporative demand drought index forecasting in Busan-Ulsan-Gyeongnam region using machine learning methods (기계학습기법을 이용한 부산-울산-경남 지역의 증발수요 가뭄지수 예측)

  • Lee, Okjeong;Won, Jeongeun;Seo, Jiyu;Kim, Sangdan
    • Journal of Korea Water Resources Association / v.54 no.8 / pp.617-628 / 2021
  • Drought is a major natural disaster that causes serious social and economic losses, and local drought forecasts can provide important information for drought preparedness. In this study, we propose a new machine learning model that predicts drought from historical drought indices and meteorological data recorded from 1981 to 2020 at 10 sites in Busan-Ulsan-Gyeongnam, in the southeastern part of the Korean Peninsula. Random Forest, XGBoost, and LightGBM models, with hyperparameters tuned by Bayesian optimization, were constructed to predict the evaporative demand drought index on a 6-month time scale one month ahead. Model performance was compared between single-site models and a regional model. In addition, we examined whether performance could be improved by fine-tuning the regional model with data from an individual site.

HI gas kinematics of paired galaxies in the cluster environment from ASKAP pilot observations

  • Kim, Shin-Jeong;Oh, Se-Heon;Kim, Minsu;Park, Hye-Jin;Kim, Shinna
    • The Bulletin of The Korean Astronomical Society / v.46 no.2 / pp.70.1-70.1 / 2021
  • We examine the HI gas kinematics and distributions of galaxy pairs in group or cluster environments using high-resolution Australian Square Kilometre Array Pathfinder (ASKAP) WALLABY pilot observations. We use 32 well-resolved close-pair galaxies from Hydra, Norma, and NGC 4636 (two clusters and a group), identified from spectroscopic information and additional visual inspection. We decompose the HI velocity profiles of the galaxies using a new tool, BAYGAUD, which separates a line-of-sight velocity profile into an optimal number of Gaussian components based on Bayesian MCMC techniques. We then construct super profiles by stacking individual HI velocity profiles after aligning their central velocities. We fit a double-Gaussian model to the super profiles and classify the components as kinematically cold or warm HI gas according to their velocity dispersions (narrower or wider σ, respectively). The kinematically cold HI gas fraction (M_cold/M_HI) of the paired galaxies is found to be higher than that of unpaired control samples in the clusters and the group, and in general shows a positive correlation with HI mass. Additionally, we quantify the gravitational instability of the HI gas disks of the sample galaxies using their Toomre Q parameters and HI morphological disturbances. While no significant difference in Q is found between the paired and unpaired galaxies, the paired galaxies tend to have larger HI asymmetry values, derived from their moment-0 maps, than the unpaired control sample.

Skin Color Region Segmentation using classified 3D skin (계층화된 3차원 피부색 모델을 이용한 피부색 분할)

  • Park, Gyeong-Mi;Yoon, Ga-Rim;Kim, Young-Bong
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.8 / pp.1809-1818 / 2010
  • To detect skin color areas in input images, much prior research has divided an image into pixels that have a skin color and pixels that do not. In still images and videos, it is very difficult to extract skin pixels exactly because lighting conditions and makeup produce wide variations in skin color. In this paper, we propose a method that improves segmentation performance on such difficult images by hierarchically merging a 3D skin color model with context information. We first build 3D color histogram distributions from skin color pixels in many YCbCr color images and then divide the color space into three layers: a skin color region (Skin), a non-skin color region (Non-skin), and a skin color candidate region (Skinness). When we segment the skin color region of an image, pixels in the Skin and Non-skin color regions are assigned to the skin region and the non-skin region, respectively. If a pixel belongs to the Skinness color region, it is assigned to the skin region or the non-skin region according to the context information of its neighbors. The proposed method helps to segment skin color regions efficiently in images that contain many distorted skin colors and colors similar to skin.

Geographical Name Denoising by Machine Learning of Event Detection Based on Twitter (트위터 기반 이벤트 탐지에서의 기계학습을 통한 지명 노이즈제거)

  • Woo, Seungmin;Hwang, Byung-Yeon
    • KIPS Transactions on Software and Data Engineering / v.4 no.10 / pp.447-454 / 2015
  • This paper proposes machine-learning-based geographical name denoising for event detection on Twitter. The growing number of smartphone users has recently increased the number of SNS users. In particular, Twitter's short messages (up to 140 characters) and follow service allow it to convey and diffuse information quickly. These characteristics, together with its mobile-optimized design, give Twitter a high information propagation speed, so it can be used to report disasters and events. Related research has treated individual Twitter users as sensors in order to detect events that occur in the real world. That research used geographical names as keywords, exploiting the fact that an event occurs in a specific place. However, it did not remove the noise that arises when a geographical name is a homograph of another word, which became an important factor lowering the accuracy of event detection. In this paper, we apply denoising using two methods, removal and forecasting. First, after a filtering step based on a purpose-built database of noise-related terms, we determine whether a term is actually a geographical name using naive Bayes classification. Finally, we obtain the probability values of the learned model from the experimental data. The forecasting technique proposed in this paper identified the need for denoising with a reliability of 89.6%.

Genetic Traceability of Black Pig Meats Using Microsatellite Markers

  • Oh, Jae-Don;Song, Ki-Duk;Seo, Joo-Hee;Kim, Duk-Kyung;Kim, Sung-Hoon;Seo, Kang-Seok;Lim, Hyun-Tae;Lee, Jae-Bong;Park, Hwa-Chun;Ryu, Youn-Chul;Kang, Min-Soo;Cho, Seoae;Kim, Eui-Soo;Choe, Ho-Sung;Kong, Hong-Sik;Lee, Hak-Kyo
    • Asian-Australasian Journal of Animal Sciences / v.27 no.7 / pp.926-931 / 2014
  • Pork from Jeju black pig (population J) and Berkshire (population B) has a unique market share in Korea because of its high meat quality. Because of the high demand for this pork, traceability to its origin is becoming an important consumer requirement. To examine the feasibility of such a system, we aim to provide basic genetic information on the two black pig populations and to assess the possibility of distinguishing the two breeds genetically. Muscle samples were collected from slaughterhouses in Jeju Island and in Namwon, Chonbuk province, Korea, for populations J and B, respectively. In total, 800 Jeju black pigs and 351 Berkshires were genotyped at thirteen microsatellite (MS) markers. The genetic diversity of the two populations was analyzed with the programs MS Toolkit and FSTAT. The population structure of the two breeds was determined by a Bayesian clustering method implemented in STRUCTURE and by a phylogenetic analysis in PHYLIP. Population J exhibited a higher mean number of alleles, expected heterozygosity, observed heterozygosity, and polymorphism information content than population B. The $F_{IS}$ values of population J and population B were 0.03 and -0.005, respectively, indicating that little or no inbreeding has occurred. In addition, genetic structure analysis revealed the possibility of gene flow from population B to population J. The expected probability of identity of the 13 MS markers was $9.87 \times 10^{-14}$ in population J, $3.17 \times 10^{-9}$ in population B, and $1.03 \times 10^{-12}$ in the two populations combined. The results of this study are useful for distinguishing the two black pig breeds and can serve as a foundation for further development of DNA markers.
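
As an illustration of the probability-of-identity statistic mentioned above, the following sketch uses the textbook formula for unrelated individuals under Hardy-Weinberg equilibrium: per locus, the sum of squared genotype frequencies, multiplied across independent loci. The abstract does not say which estimator the authors used, and the allele frequencies below are hypothetical, not taken from the study.

```python
from itertools import combinations
from math import prod


def locus_probability_of_identity(allele_freqs):
    """Chance that two unrelated individuals share a genotype at one locus.

    Assumes Hardy-Weinberg proportions: homozygotes have frequency p_i^2 and
    heterozygotes 2*p_i*p_j, so the match probability is the sum of the
    squared genotype frequencies.
    """
    homozygous = sum(p ** 4 for p in allele_freqs)
    heterozygous = sum((2 * p * q) ** 2 for p, q in combinations(allele_freqs, 2))
    return homozygous + heterozygous


def multilocus_probability_of_identity(loci):
    """Multiply per-locus values, assuming the loci are independent."""
    return prod(locus_probability_of_identity(freqs) for freqs in loci)


# Hypothetical allele frequencies for three markers (illustration only).
example_loci = [
    [0.5, 0.5],
    [0.4, 0.3, 0.3],
    [0.25, 0.25, 0.25, 0.25],
]
combined_pi = multilocus_probability_of_identity(example_loci)
```

Because the per-locus values multiply, each additional polymorphic marker lowers the combined value, which is why a panel of 13 markers can reach very small probabilities.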

Models for Estimating Genetic Parameters of Milk Production Traits Using Random Regression Models in Korean Holstein Cattle

  • Cho, C.I.;Alam, M.;Choi, T.J.;Choy, Y.H.;Choi, J.G.;Lee, S.S.;Cho, K.H.
    • Asian-Australasian Journal of Animal Sciences / v.29 no.5 / pp.607-614 / 2016
  • The objectives of the study were to estimate genetic parameters for milk production traits of Holstein cattle using random regression models (RRMs), and to compare the goodness of fit of various RRMs with homogeneous and heterogeneous residual variances. A total of 126,980 test-day milk production records of first-parity Holstein cows, collected between 2007 and 2014 by the Dairy Cattle Improvement Center of the National Agricultural Cooperative Federation in South Korea, were used. These records included milk yield (MILK), fat yield (FAT), protein yield (PROT), and solids-not-fat yield (SNF). The statistical models included random genetic and permanent environmental effects modeled with Legendre polynomials (LP) of the third to fifth order (L3-L5), fixed effects of herd-test day and year-season at calving, and a fixed regression for the test-day record (third to fifth order). The residual variances in the models were either homogeneous (HOM) or heterogeneous (15 classes, HET15; 60 classes, HET60). A total of nine models (3 polynomial orders $\times$ 3 types of residual variance), namely L3-HOM, L3-HET15, L3-HET60, L4-HOM, L4-HET15, L4-HET60, L5-HOM, L5-HET15, and L5-HET60, were compared using the Akaike information criterion (AIC) and/or the Schwarz Bayesian information criterion (BIC) to identify the best-fitting model(s) for each trait. The lowest BIC values were observed for L5-HET15 (MILK, PROT, SNF) and L4-HET15 (FAT), which therefore fit best. In most cases, the BIC value of the HET15 model for a given polynomial order was lower than that of the HET60 model. This implies that the order of the LP and the type of residual variance affect the goodness of fit, and that heterogeneity of residual variances should be considered in test-day analysis. The heritability estimates from the best-fitting models ranged from 0.08 to 0.15 for MILK, 0.06 to 0.14 for FAT, 0.08 to 0.12 for PROT, and 0.07 to 0.13 for SNF, depending on days in milk of the first lactation. Genetic variances for the studied traits tended to decrease during the early stages of lactation, increase in the middle, and decrease again at the end of lactation. Given the fit of the models and the differing genetic parameters across lactation stages, genetic parameters could be estimated more accurately from RRMs than from lactation models. Therefore, we suggest using RRMs in place of lactation models for national dairy cattle genetic evaluations of milk production traits in Korea.
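
To make the AIC/BIC comparison concrete, here is a minimal sketch using the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The model names, log-likelihoods, parameter counts, and sample size are hypothetical placeholders, not values from the study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Schwarz Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical candidate models (illustration only).
n_obs = 1000
candidates = {
    "simple": {"loglik": -1520.0, "k": 10},
    "medium": {"loglik": -1490.0, "k": 25},
    "complex": {"loglik": -1485.0, "k": 70},
}

scores = {
    name: {
        "aic": aic(fit["loglik"], fit["k"]),
        "bic": bic(fit["loglik"], fit["k"], n_obs),
    }
    for name, fit in candidates.items()
}
best_by_aic = min(scores, key=lambda name: scores[name]["aic"])
best_by_bic = min(scores, key=lambda name: scores[name]["bic"])
```

With these made-up numbers the two criteria disagree: AIC selects "medium" while BIC selects "simple", because the BIC penalty per parameter (ln n) exceeds the AIC penalty (2) once n is larger than about 7. The same effect can favor a model with fewer residual-variance classes over one with many.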

Examining Impact of Weather Factors on Apple Yield (사과생산량에 영향을 미치는 기상요인 분석)

  • Kim, Mi Ri;Kim, Seung Gyu
    • Korean Journal of Agricultural and Forest Meteorology / v.16 no.4 / pp.274-284 / 2014
  • Crops and varieties are affected mainly by temperature, precipitation, and duration of sunshine. This study aims to identify, among a series of weather variables measured daily during the growing season, the weather factors that directly influence apple yield. To identify them, we consider 1) a priori natural-science knowledge of the growth stages of apples and 2) purely statistical approaches that minimize the bias caused by subjective selection of variables. Each result, estimated by panel regression with fixed- and random-effects models, is evaluated in terms of suitability (Akaike information criterion and Bayesian information criterion) and predictability (mean absolute error, root mean square error, and mean absolute percentage error). For the case study, panel data on apple yield and weather factors were collected from fifteen major apple-producing areas in Korea from 2006 to 2013. The results show that variable selection by factor analysis, one of the statistical approaches applied, increases predictability and suitability the most. This may imply that all the weather factors are important for predicting apple yield, provided that statistical problems such as multicollinearity and the loss of degrees of freedom caused by too many explanatory variables can be controlled effectively. This may be because all growth stages, including germination, florescence, fruit setting, fattening, ripening, coloring, and harvesting, are affected by weather.
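
The three predictability measures named above can be written out as follows. This is a generic sketch of the usual definitions, with hypothetical yield values rather than data from the study.

```python
import math


def mean_absolute_error(actual, predicted):
    """Average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def root_mean_square_error(actual, predicted):
    """Square root of the average squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mean_absolute_percentage_error(actual, predicted):
    """Average of |error / actual| in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical observed and predicted yields (illustration only).
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 380.0]

mae = mean_absolute_error(observed, forecast)
rmse = root_mean_square_error(observed, forecast)
mape = mean_absolute_percentage_error(observed, forecast)
```

RMSE weights large errors more heavily than MAE, while MAPE expresses error relative to the observed value, so the three measures can rank competing models differently.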

Probabilistic Optimization for Improving Soft Marine Ground using a Low Replacement Ratio (해상 연약지반의 저치환율 개량에 대한 확률론적 최적화)

  • Han, Sang-Hyun;Kim, Hong-Yeon;Yea, Geu-Guwen
    • The Journal of Engineering Geology / v.26 no.4 / pp.485-495 / 2016
  • To reinforce and improve the soft ground under a breakwater while using materials efficiently, the replacement ratio and the leaving period of the surcharge load are optimized probabilistically. Bayesian updating of the random variables using prior information decreases uncertainty by up to 39.8%, and using prior information based on more samples results in a sharp decrease in uncertainty. Replacement ratios of 15%-40% are analyzed using the First Order Reliability Method and Monte Carlo simulation to optimize the replacement ratio. The results show that replacement ratios of 20% and 25% are acceptable for the column jet grouting area and the granular compaction pile area, respectively. Life cycle costs are also compared to optimize the replacement ratio within the allowable range. The results show that a range of 20%-30% is the most economical over the total life cycle, meaning that the sum of initial construction cost, maintenance cost, and failure loss cost is minimized. Probabilistic analysis of the leaving period of the surcharge load shows that three months is acceptable. Design optimization with respect to life cycle cost is important for minimizing maintenance costs and retaining the performance of structures for the required period. Therefore, more case studies that consider the maintenance costs of soil structures are necessary to establish relevant design codes.
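
How Bayesian updating reduces uncertainty can be sketched with the simplest conjugate case: a normal prior on a parameter's mean combined with normally distributed observations of known standard deviation. The abstract does not describe the distributions or the updating scheme the authors used, so this model and all the numbers below are assumptions chosen only for illustration.

```python
import math


def update_normal_mean(prior_mean, prior_sd, obs_mean, obs_sd, n_obs):
    """Posterior mean and standard deviation for a normal mean.

    Assumes a normal prior and n_obs independent normal observations with
    known standard deviation obs_sd. Precisions (1 / variance) add.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n_obs / obs_sd ** 2
    posterior_variance = 1.0 / (prior_precision + data_precision)
    posterior_mean = posterior_variance * (
        prior_precision * prior_mean + data_precision * obs_mean
    )
    return posterior_mean, math.sqrt(posterior_variance)


# Hypothetical soil parameter (illustration only).
prior_mean, prior_sd = 20.0, 5.0
mean_4, sd_4 = update_normal_mean(prior_mean, prior_sd, obs_mean=24.0, obs_sd=4.0, n_obs=4)
mean_16, sd_16 = update_normal_mean(prior_mean, prior_sd, obs_mean=24.0, obs_sd=4.0, n_obs=16)

# Relative reduction in standard deviation after each update.
reduction_4 = 1.0 - sd_4 / prior_sd
reduction_16 = 1.0 - sd_16 / prior_sd
```

In this toy model the posterior standard deviation always falls below the prior value and shrinks further as the number of samples grows, which mirrors the qualitative behavior reported in the abstract.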

Intercomparison of Change Point Analysis Methods for Identification of Inhomogeneity in Rainfall Series and Applications (강우자료의 비동질성 규명을 위한 변동점 분석기법의 상호비교 및 적용)

  • Lee, Sangho;Kim, Sang Ug;Lee, Yeong Seob;Sung, Jang Hyun
    • Journal of Korea Water Resources Association / v.47 no.8 / pp.671-684 / 2014
  • Change point analysis is an efficient tool for understanding fundamental information in hydro-meteorological data such as rainfall, discharge, and temperature. In particular, change points in future rainfall data identified by sound detection techniques can affect the prediction of flood and drought occurrence, because well-detected change points provide a key to resolving the non-stationarity or inhomogeneity caused by climate change. In this study, we therefore compared the performance of three change point detection techniques: the cumulative sum (CUSUM) method, the Bayesian change point (BCP) method, and segmentation by dynamic programming (DP). After the performance of the techniques was assessed on three types of synthetic series, the two better-performing techniques were applied to observed and future rainfall data at five rainfall gauges in South Korea. The comparison suggests that BCP (with a posterior probability of 0.9) is the best detection technique and that DP can also be reasonably recommended. It also suggests that the change points found by BCP (with a posterior probability of 0.9) and DP are reasonable for the northeastern part of South Korea. In the future, the results of this study can be used to address non-stationarity in hydrological modeling that accounts for inhomogeneity.
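
Of the three techniques compared above, CUSUM is the simplest to illustrate. The sketch below implements a basic single-change-point variant: accumulate deviations from the overall mean and take the index where the cumulative sum is furthest from zero. The abstract does not specify which CUSUM variant or significance test the authors used, and the series is synthetic.

```python
from itertools import accumulate
from statistics import fmean


def cusum_change_point(series):
    """Locate one shift in the mean of a series with a basic CUSUM statistic.

    Returns (index, cusum), where index is the position of the first
    observation after the estimated change and cusum is the list of
    cumulative sums of deviations from the overall mean.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    overall_mean = fmean(series)
    cusum = list(accumulate(x - overall_mean for x in series))
    # The final cumulative sum is always about zero, so it is excluded.
    last_before_change = max(range(len(series) - 1), key=lambda i: abs(cusum[i]))
    return last_before_change + 1, cusum


# Synthetic series whose mean shifts from about 10 to about 14 at index 6.
rainfall = [10, 11, 9, 10, 11, 9, 14, 15, 13, 14, 15, 13]
change_index, cusum_values = cusum_change_point(rainfall)
```

A full analysis would also test whether the detected shift is statistically significant, for example by bootstrapping the range of the cumulative sums; that step is omitted here.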

An estimation method for non-response model using Monte-Carlo expectation-maximization algorithm (Monte-Carlo expectation-maximaization 방법을 이용한 무응답 모형 추정방법)

  • Choi, Boseung;You, Hyeon Sang;Yoon, Yong Hwa
    • Journal of the Korean Data and Information Science Society / v.27 no.3 / pp.587-598 / 2016
  • Non-response is one of the major issues in predicting the outcome of an election in advance. A variety of non-response imputation methods can be employed to address it, but the resulting forecasts tend to vary with the method used. In this study, to improve electoral forecasts, we studied a model-based method of non-response imputation that applies the Monte Carlo Expectation Maximization (MCEM) algorithm introduced by Wei and Tanner (1990). The MCEM algorithm, which uses maximum likelihood estimates (MLEs), is applied to solve the boundary solution problem under the non-ignorable non-response mechanism. We performed simulation studies to compare the estimation performance of MCEM, maximum likelihood estimation, and Bayesian estimation. The simulation results showed that MCEM can be a reasonable candidate for non-response model estimation. We also applied MCEM to exit poll data from the 2012 Korean presidential election and investigated its prediction performance using the modified within precinct error (MWPE) criterion (Bautista et al., 2007).