• Title/Summary/Keyword: Maximum Likelihood.


Evidence for Polyphyletic Origin of the Members of the Subsection IV Cyanobacteria as Determined by 16S rRNA Analysis (16S rRNA 분석에 의한 Subsection IV cyanobacteria 균주들의 다계통성 기원의 증거)

  • Shin, Yong Kook;Seo, Pil-Soo
    • Journal of Life Science / v.26 no.10 / pp.1202-1206 / 2016
  • Unicellular cyanobacterial strains of Subsections I and II and filamentous cyanobacterial strains of Subsection III have been shown to be polyphyletic, whereas heterocystous strains of Subsections IV and V were both previously reported to be monophyletic. In this study, the small subunit ribosomal RNA (16S rRNA) sequences of 13 cyanobacterial strains were determined: one strain of Subsection III (Oscillatoria nigro-viridis PCC7112), 6 strains of Subsection IV including the genera Anabaena, Nostoc, Tolypothrix, Calothrix, and Scytonema, and 6 strains of Subsection V including the genera Hapalosiphon, Fischerella, and Chlorogloeopsis. Phylogenetic analyses of the 16S rRNA sequences, based on the neighbour-joining, maximum-parsimony, and maximum-likelihood methods, indicated that the members of Subsection IV were not monophyletic but polyphyletic. In addition, the results strongly indicated that the genus Scytonema in Subsection IV could be a common ancestor of the heterocystous cyanobacteria in Subsections IV and V. Furthermore, the analyses revealed that the genus Anabaena could be phylogenetically diverse and that cyanobacterial strains in Subsection IV might be polyphyletic, whereas those in Subsection V could be monophyletic, as reported before. The results for the genus Anabaena indicate that it should be reclassified.

Comparison Study on the Various Forms of Scale Parameter for the Nonstationary Gumbel Model (다양한 규모매개변수를 이용한 비정상성 Gumbel 모형의 비교 연구)

  • Jang, Hanjin;Kim, Sooyoung;Heo, Jun-Haeng
    • Journal of Korea Water Resources Association / v.48 no.5 / pp.331-343 / 2015
  • Most nonstationary frequency models are defined as probability models containing time-dependent parameters. For frequency analysis of annual maximum rainfall data, the Gumbel distribution is generally recommended in Korea. In nonstationary Gumbel models, the time-dependent location and scale parameters are usually defined by linear and exponential relationships, respectively. The exponentially time-varying scale parameter is generally used because the scale parameter must be positive. However, the exponential form occasionally yields overestimated quantiles. In this study, various forms of time-varying scale parameter (exponential, linear, and logarithmic) were proposed and compared. The parameters were estimated by the method of maximum likelihood. To compare the accuracy of each form, Monte Carlo simulation was performed under various conditions. Additionally, nonstationary frequency analysis was conducted for sites with more than 30 years of rainfall data exhibiting a trend. As a result, the nonstationary Gumbel model with the exponentially time-varying scale parameter generally had the smallest root mean square error compared with the other forms.
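
The scale-parameter forms compared in the abstract above can be illustrated with a small sketch of the negative log-likelihood that a maximum likelihood fit would minimize. This is not the authors' code: the function names, the parameter names (`mu0`, `mu1`, `s0`, `s1`), the use of the year index `t = 1, 2, ...` as the covariate, and the exact logarithmic parameterization are all assumptions made for illustration.

```python
import math

# Three candidate forms for a time-varying Gumbel scale parameter.
# The parameterizations are illustrative assumptions, not taken from the paper.
def scale_exponential(t, s0, s1):
    return math.exp(s0 + s1 * t)      # always positive

def scale_linear(t, s0, s1):
    return s0 + s1 * t                # may become non-positive

def scale_logarithmic(t, s0, s1):
    return s0 + s1 * math.log(t)      # may become non-positive

def gumbel_nll(data, mu0, mu1, s0, s1, scale_fn):
    """Negative log-likelihood of a nonstationary Gumbel model with
    linear location mu(t) = mu0 + mu1 * t and scale given by scale_fn."""
    total = 0.0
    for t, x in enumerate(data, start=1):
        mu = mu0 + mu1 * t
        sigma = scale_fn(t, s0, s1)
        if sigma <= 0:
            return math.inf           # invalid scale: reject these parameters
        z = (x - mu) / sigma
        # Gumbel density: (1 / sigma) * exp(-(z + exp(-z)))
        total += math.log(sigma) + z + math.exp(-z)
    return total
```

Passing `gumbel_nll` to a numerical optimizer for each form would give the estimates to compare. The `math.inf` branch shows why the exponential form is the usual default: it is the only one of the three that cannot produce a non-positive scale.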

A study on the variation of design flood due to climate change in the ungauged urban catchment (기후변화에 따른 미계측 도시유역의 확률홍수량 변화에 관한 연구)

  • Hwang, Jeongyoon;Ahn, Jeonghwan;Jeong, Changsam;Heo, Jun-Haeng
    • Journal of Korea Water Resources Association / v.51 no.5 / pp.395-404 / 2018
  • This research evaluated the change in rainfall quantiles during the periods S1, S2, and S3, relative to the past standard observation data S0, using HadGEM3-RA Regional Climate Model (RCM) output for the Representative Concentration Pathways (RCP) 4.5 climate scenario, produced by downscaling and bias correction. In addition, the maximum flood peak volume and flood area were calculated using an urban runoff model, and the impact of climate change was analyzed for each period. For this purpose, the Gumbel distribution was used as the appropriate model, with parameters estimated by the method of maximum likelihood. As a result, for the 10-year frequency, which is the design frequency of most urban drainage facilities, the rainfall quantile based on the $3^{rd}$ quartile value increases by about 10% 50 years from now and by about 20% 70 years from now. This result implies that urban drainage facilities installed on the basis of the currently set design flood volume will not meet the design criteria in the future. Therefore, future climate conditions need to be reflected in current urban drainage facilities.

A Derivation of Rainfall Intensity-Duration-Frequency Relationship for the Design of Urban Drainage System in Korea (우리나라 도시배수시스템 설계를 위한 확률강우강도식의 유도)

  • Lee, Jae-Jun;Lee, Jeong-Sik
    • Journal of Korea Water Resources Association / v.32 no.4 / pp.403-415 / 1999
  • This study derives a rainfall intensity formula based on the representative probability distribution in Korea. Eleven probability distributions that have been widely used in hydrologic frequency analysis are applied to the annual maximum rainfall. The parameters of each probability distribution are estimated by the method of moments, the maximum likelihood method, and the method of probability weighted moments. Four tests ($\chi^2$-test, Kolmogorov-Smirnov test, difference test, and modified difference test) are used to determine the goodness of fit of the distributions. Homogeneity tests (the Mann-Whitney U test and the Kruskal-Wallis nonparametric one-way analysis of variance) are applied to find the stations with rainfall homogeneity. The results of the homogeneity tests show that there is no single representative distribution appropriate for all durations across Korea. The whole region could be divided into five zones for 12 durations, and the representative probability distribution of each zone was determined for the 12 durations. The GEV distribution for zones I, II, and V and the 3-parameter Weibull distribution for zones III and IV were determined as the representative probability distributions. Rainfall values were obtained from the representative probability distributions for the selected return periods. The rainfall intensity formula was then determined by applying a linearization technique to these rainfall values.


A Study on the Ordered Subsets Expectation Maximization Reconstruction Method Using Gibbs Priors for Emission Computed Tomography (Gibbs 선행치를 사용한 배열된부분집합 기대값최대화 방출단층영상 재구성방법에 관한 연구)

  • Im, K. C.;Choi, Y.;Kim, J. H.;Lee, S. J.;Woo, S. K.;Seo, H. K.;Lee, K. H.;Kim, S. E.;Choe, Y. S.;Park, C. C;Kim, B. T.
    • Journal of Biomedical Engineering Research / v.21 no.5 / pp.441-448 / 2000
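  • The maximum likelihood expectation maximization (MLEM) method for emission tomography reconstructs images by statistically modeling the image acquisition process. MLEM has many advantages over the commonly used filtered backprojection method, but it suffers from divergence as the number of iterations increases and from long reconstruction times. To address these drawbacks, this paper implements OSEM-MAP (maximum a posteriori), which adds a Gibbs prior, either membrane (MM) or thin plate (TP), to ordered subsets expectation maximization (OSEM), a method that markedly reduces computation time, with the aim of improving the stability of the algorithm and the quality of the reconstructed images. In the experiments, the projection data were divided into 16 subsets for the iterative computation in order to accelerate convergence. To compare the performance of the algorithms, reconstruction results for software phantoms (monkey brain autoradiograph and mathematical cardiac-torso) were compared by squared error. In addition, to evaluate the applicability of the algorithms, real projection data acquired from a PET scanner with a physical phantom were used.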
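
The relationship between MLEM and OSEM named in the abstract above can be sketched as follows. This is a minimal illustration of the standard multiplicative EM update, not the paper's implementation: it omits the Gibbs prior (so it is plain OSEM, not OSEM-MAP), and the dense list-of-lists `system` matrix and all function names are assumptions chosen for readability.

```python
def em_update(image, system, counts, rows):
    """One multiplicative EM update using only the projection bins in `rows`.

    system[i][j] is the assumed probability that an emission from pixel j
    is detected in bin i; counts[i] is the measured count in bin i.
    """
    n_pix = len(image)
    # Forward-project the current estimate into each selected bin.
    proj = {i: sum(system[i][j] * image[j] for j in range(n_pix)) for i in rows}
    updated = []
    for j in range(n_pix):
        sens = sum(system[i][j] for i in rows)   # sensitivity of pixel j
        if sens == 0:
            updated.append(image[j])             # pixel unseen by this subset
            continue
        # Back-project the ratio of measured to estimated counts.
        back = sum(system[i][j] * counts[i] / proj[i] for i in rows if proj[i] > 0)
        updated.append(image[j] * back / sens)
    return updated

def mlem_iteration(image, system, counts):
    """MLEM: one update that uses every projection bin."""
    return em_update(image, system, counts, range(len(system)))

def osem_iteration(image, system, counts, subsets):
    """OSEM: one pass made of successive updates, one per subset of bins."""
    for rows in subsets:
        image = em_update(image, system, counts, rows)
    return image
```

With 16 subsets, as in the abstract, one OSEM pass applies 16 image updates for roughly the cost of a single MLEM iteration, which is where the speed-up comes from.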


Product Recommender Systems using Multi-Model Ensemble Techniques (다중모형조합기법을 이용한 상품추천시스템)

  • Lee, Yeonjeong;Kim, Kyoung-Jae
    • Journal of Intelligence and Information Systems / v.19 no.2 / pp.39-54 / 2013
  • The recent explosive growth of electronic commerce provides customers with many advantageous purchase opportunities. In this situation, customers who do not have enough knowledge about their purchases may accept product recommendations. Product recommender systems automatically reflect the user's preferences and provide a recommendation list to the user. Thus, the product recommender system of an online shopping store is known as one of the most popular tools for one-to-one marketing. However, recommender systems that do not properly reflect the user's preferences disappoint users and waste their time. In this study, we propose a novel recommender system that uses data mining and multi-model ensemble techniques to enhance recommendation performance by reflecting the user's preferences precisely. The research data were collected from a real-world online shopping store that sells products from famous art galleries and museums in Korea. The data initially contained 5,759 transactions, of which 3,167 remained after deletion of null data. We transformed the categorical variables into dummy variables and excluded outliers. The proposed model consists of two steps. The first step predicts the customers who have a high likelihood of purchasing products in the online shopping store. In this step, we first use logistic regression, decision trees, and artificial neural networks to predict the customers who have a high likelihood of purchasing products in each product group. We apply these data mining techniques using SAS E-Miner software. We partition the dataset into modeling and validation sets for logistic regression and decision trees, and into training, test, and validation sets for the artificial neural network model. The validation dataset is the same for all experiments. We then combine the results of the predictors using multi-model ensemble techniques such as bagging and bumping. Bagging is the abbreviation of "Bootstrap Aggregation"; it combines the outputs of several machine learning techniques to raise the performance and stability of prediction or classification, and is a special form of the averaging method. Bumping is the abbreviation of "Bootstrap Umbrella of Model Parameter"; it considers only the model with the lowest error value. The results show that bumping outperforms bagging and the other predictors except for the "Poster" product group, for which the artificial neural network model performs better than the other models. In the second step, we use market basket analysis to extract association rules for co-purchased products. We extract thirty-one association rules according to the values of the Lift, Support, and Confidence measures. We set the minimum transaction frequency to support an association to 5%, the maximum number of items in an association to 4, and the minimum confidence for rule generation to 10%. We also exclude extracted association rules with a lift value below 1. After excluding duplicate rules, fifteen association rules remain. Of these, eleven rules contain associations between products in the "Office Supplies" product group, one rule contains an association between the "Office Supplies" and "Fashion" product groups, and the other three rules contain associations between the "Office Supplies" and "Home Decoration" product groups. Finally, the proposed product recommender system provides a list of recommendations to the appropriate customers. We test the usability of the proposed system using a prototype together with real-world transaction and profile data. To this end, we built the prototype system using ASP, JavaScript, and Microsoft Access. In addition, we surveyed user satisfaction with the product list recommended by the proposed system and with randomly selected product lists. The survey participants were 173 users of MSN Messenger, Daum Café, and P2P services. We evaluated user satisfaction on a five-point Likert scale and performed a paired-sample t-test on the survey results. The results show that the proposed model outperforms the random selection model at the 1% significance level, meaning that users were significantly more satisfied with the recommended product list. The results also show that the proposed system may be useful in real-world online shopping stores.
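
The rule-filtering step in the abstract above relies on three standard measures, which can be sketched as follows. The thresholds (5% support, 10% confidence, lift of at least 1) are the ones stated in the abstract; the function names and the toy baskets in the usage note are hypothetical, and the authors' own tooling is not shown in the source.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    baskets = [set(t) for t in transactions]
    a, c = set(antecedent), set(consequent)
    n = len(baskets)
    n_a = sum(1 for t in baskets if a <= t)           # baskets containing antecedent
    n_c = sum(1 for t in baskets if c <= t)           # baskets containing consequent
    n_both = sum(1 for t in baskets if (a | c) <= t)  # baskets containing both
    support = n_both / n
    confidence = n_both / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift

def keep_rule(measures, min_support=0.05, min_confidence=0.10, min_lift=1.0):
    """Apply the thresholds quoted in the abstract to one rule."""
    support, confidence, lift = measures
    return support >= min_support and confidence >= min_confidence and lift >= min_lift
```

For example, with the hypothetical baskets `[{"pen", "notebook"}, {"pen"}, {"notebook"}, {"pen", "notebook"}]`, the rule pen -> notebook has support 0.5 and confidence 2/3 but lift 8/9, so `keep_rule` rejects it: the two items co-occur slightly less often than independence would predict.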

Developing a Traffic Accident Prediction Model for Freeways (고속도로 본선에서의 교통사고 예측모형 개발)

  • Mun, Sung-Ra;Lee, Young-Ihn;Lee, Soo-Beom
    • Journal of Korean Society of Transportation / v.30 no.2 / pp.101-116 / 2012
  • Accident prediction models have been used to predict accident possibilities on existing or planned freeways and to evaluate programs or policies for improving safety. In this study, a traffic accident prediction model for freeways was developed for these purposes. In selecting variables for the model, the highest priority was given to the ease of both collecting the data and applying them to the model. The dependent variables were the number of total accidents and the number of accidents involving casualties, counted per IC (or JCT) section. As a result, two models were developed: the overall accident model and the casualty-related accident model. The error structures adopted for the two models were the negative binomial distribution and the Poisson distribution, respectively, with the more appropriate structure for each model selected by statistical estimation. Nine major national freeways were selected, and five years of data from 2003 to 2007 were used. Explanatory variables should take either a predictable value, such as traffic volume, or a fixed value related to geometric conditions. Maximum likelihood estimation showed that the significant variables of the overall accident model were the link length between ICs (or JCTs), the daily volumes (AADT), and the ratio of bus volume to the number of curved segments between ICs (or JCTs). For the casualty-related accident model, the link length between ICs (or JCTs), the daily volumes (AADT), and the ratio of bus volumes had a significant impact on accidents. The likelihood ratio test was conducted to verify the spatial and temporal transferability of the estimated parameters of each model. The overall accident model could be transferred only to roads with four or more than six lanes, whereas the casualty-related accident model was transferable to every road and every time period. In conclusion, the model developed in this study can be extended to various applications for establishing future plans and evaluating policies.
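
The likelihood ratio test used for transferability in the abstract above can be illustrated with a deliberately simplified sketch. It assumes an intercept-only Poisson model (one mean accident count per group, no covariates and no negative binomial dispersion), which is far simpler than the regression models in the paper; the function names are hypothetical, and every group must contain at least one nonzero count so that its mean is positive.

```python
import math

def poisson_loglik(counts, mean):
    """Poisson log-likelihood of integer counts under a common mean (> 0)."""
    return sum(k * math.log(mean) - mean - math.lgamma(k + 1) for k in counts)

def lr_statistic(group_a, group_b):
    """Likelihood ratio statistic: one pooled mean versus a separate mean per group.

    The maximum likelihood estimate of a Poisson mean is the sample mean.
    """
    pooled = list(group_a) + list(group_b)
    ll_pooled = poisson_loglik(pooled, sum(pooled) / len(pooled))
    ll_separate = (poisson_loglik(group_a, sum(group_a) / len(group_a))
                   + poisson_loglik(group_b, sum(group_b) / len(group_b)))
    # Large values suggest the pooled parameter does not transfer across groups.
    return 2.0 * (ll_separate - ll_pooled)
```

Under the pooled hypothesis the statistic is approximately chi-square with one degree of freedom here, so a value above about 3.84 would reject transferability at the 5% level.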

Estimation of Variance Component and Environment Effects on Somatic Cell Scores by Parity in Dairy Cattle (젖소집단의 산차에 따른 체세포점수의 환경효과 및 분산성분 추정)

  • 조광현;나승환;서강석;김시동;박병호;이영창;박종대;손삼규;최재관
    • Journal of Animal Science and Technology / v.48 no.1 / pp.39-48 / 2006
  • This study used test-day somatic cell score data of dairy cattle from 2000 to 2004. The numbers of records used were 124,635 for first parity, 134,308 for second parity, 77,862 for third parity, 41,787 for fourth parity, and 37,412 for fifth parity. The data were analyzed by the least squares means method using GLM to estimate the effects of calving year, age, lactation stage, parity, and season on somatic cell score. Variance components under a test-day model were estimated using the expectation maximization restricted maximum likelihood (EM-REML) method. In each parity, the somatic cell score was low in the younger group and relatively high in the older groups. Likewise, for lactation stage, the score was low in early lactation and high in late lactation in the first and second parities. For the third, fourth, and fifth parities, however, a high somatic cell score was observed in mid-lactation. Generally, the score was high at the peak, although in the fourth and fifth parities the score was low in late lactation. Regarding the environmental effect of season, the somatic cell score was generally low from September to November for all parities. The score was high between June and August, when milk production is usually low. The heritability estimates were 0.05, 0.09, 0.10, 0.05, and 0.05 for parities 1, 2, 3, 4, and 5, respectively. The genetic variance was estimated to be high in early lactation for the second, third, and fifth parities and low for the first and fourth parities.

Analysis of Consumer Preferences for Wine (국산 포도주 개발을 위한 소비자 선호분석)

  • Park, Eun-Kyung;Ryu, Jin-Chun;Kim, Tae-Kyun
    • Food Science and Preservation / v.17 no.3 / pp.418-424 / 2010
  • Although the wine industry continues to grow, little empirical research on consumer preferences has been conducted. Thus, our objective was to analyze consumer views on wine attributes. A choice experiment (CE) was designed to detect a marginal willingness to pay for particular characteristics of wine (balance, flavor, color, clarity, and value-for-money). A questionnaire was administered and 286 responses were received. A multinomial logit model was estimated using the maximum likelihood method. The results indicated that balance, flavor, color, clarity, and price were all important to consumers. The CE data revealed that estimates of marginal willingness to pay were 31,899 won/bottle for balance, 23,088 won/bottle for flavor, 3,230 won/bottle for color, and 25,936 won/bottle for clarity. The balance of a wine was most important, and the flavor, clarity, and color were also significant. The results of this work will be of assistance in promoting the domestic wine industry.
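
The link between a multinomial logit model and marginal willingness to pay in the abstract above can be sketched briefly. This assumes a utility that is linear in the attributes and in price, in which case marginal willingness to pay is the negative ratio of the attribute coefficient to the price coefficient. The function names and the coefficient values in the usage note are hypothetical and are not the paper's estimates.

```python
import math

def choice_probabilities(utilities):
    """Multinomial logit choice probabilities for one choice set."""
    m = max(utilities)                          # subtract max for numerical stability
    weights = [math.exp(u - m) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def marginal_wtp(beta_attribute, beta_price):
    """Marginal willingness to pay, assuming linear-in-parameters utility.

    beta_price is expected to be negative (a higher price lowers utility).
    """
    return -beta_attribute / beta_price
```

For instance, a hypothetical attribute coefficient of 0.5 with a price coefficient of -0.00002 per won gives a marginal willingness to pay of about 25,000 won per bottle. Maximum likelihood estimation chooses the coefficients that maximize the sum of the log of `choice_probabilities` at each respondent's chosen alternative.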

A Software Reliability Cost Model Based on the Shape Parameter of Lomax Distribution (Lomax 분포의 형상모수에 근거한 소프트웨어 신뢰성 비용모형에 관한 연구)

  • Yang, Tae-Jin
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.9 no.2 / pp.171-177 / 2016
  • Software reliability is an important issue in the software development process. Software process improvement helps in delivering a reliable software product. Infinite-failure NHPP software reliability models presented in the literature exhibit constant, monotonically increasing, or monotonically decreasing failure occurrence rates per fault. In this study, a software reliability cost model that considers the shape parameter of a life distribution, based on the software product testing process, was studied. A cost comparison problem is presented for the reliability growth model based on the Lomax distribution, which is widely used in the field of reliability. The software failure model used was the infinite-failure non-homogeneous Poisson process model, and the parameters were estimated by maximum likelihood estimation for the analysis of the software cost model considering the shape parameter. In practice, defects can scarcely be avoided when changing and fixing large software. The optimal release time is the one that meets the reliability requirements while minimizing the total cost. Comparative studies of the release problem using other distributions, such as the Kappa and exponential distributions, are also considered worthwhile. This research is expected to help software developers estimate software development costs to some extent.