• Title/Summary/Keyword: statistics based method

ON THEIL'S METHOD IN FUZZY LINEAR REGRESSION MODELS

  • Choi, Seung Hoe;Jung, Hye-Young;Lee, Woo-Joo;Yoon, Jin Hee
    • Communications of the Korean Mathematical Society / v.31 no.1 / pp.185-198 / 2016
  • Regression analysis is a method for explaining the statistical relationship between explanatory and response variables. This paper proposes a fuzzy regression analysis that applies Theil's method, which is not sensitive to outliers. The method uses medians of rates of increment, computed from randomly chosen pairs of each component of the $\alpha$-level sets of the fuzzy data, to estimate the coefficients of the fuzzy regression model. An example and two simulation results are given to show that the fuzzy Theil's estimator is more robust than the fuzzy least squares estimator.
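
The median-of-slopes idea in the abstract above can be illustrated on ordinary (crisp) data. The sketch below is the classical Theil estimator, not the paper's fuzzy version: it uses all pairs rather than randomly chosen ones, ignores $\alpha$-level sets, and the function names and toy data are assumptions made for illustration.

```python
from itertools import combinations
from statistics import median


def theil_sen(xs, ys):
    """Classical (crisp) Theil estimator: median of pairwise slopes."""
    slopes = [
        (ys[j] - ys[i]) / (xs[j] - xs[i])
        for i, j in combinations(range(len(xs)), 2)
        if xs[j] != xs[i]  # skip pairs with equal x (slope undefined)
    ]
    slope = median(slopes)
    # Intercept: median of the residuals y - slope * x.
    intercept = median(y - slope * x for x, y in zip(xs, ys))
    return slope, intercept


def least_squares(xs, ys):
    """Ordinary least squares fit of a line, for comparison."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx


# Toy data (made up): y = 2x + 1, except the last point, which is an outlier.
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [1, 3, 5, 7, 9, 11, 100]

ts_slope, ts_intercept = theil_sen(xs, ys)
ls_slope, ls_intercept = least_squares(xs, ys)
print("Theil:", ts_slope, ts_intercept)
print("Least squares:", ls_slope, ls_intercept)
```

On this toy data the single outlier leaves the median of slopes at the underlying line, while the least squares slope is pulled strongly upward, which is the kind of robustness the abstract refers to.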

Finding the Maximum Flow in a Network with Simple Paths

  • Lee, Seung-Min;Lee, Chong-Hyung;Park, Dong-Ho
    • Communications for Statistical Applications and Methods / v.9 no.3 / pp.845-851 / 2002
  • An efficient method is developed to obtain the maximum flow for a network when its simple paths are known. Most existing techniques need either to convert simple paths into minimal cuts or to determine the order in which the simple paths are applied in order to reach the correct result. In this paper, we propose a method based on the concepts of signed simple path and signed flow defined in the text. Our method involves fewer arithmetic operations at each iteration and requires fewer iterations overall than the existing methods. It can be easily extended to a mixed network with a slight modification. Furthermore, its correctness does not depend on the order in which the simple paths are applied.
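
The signed-simple-path method is not specified in this abstract, so it is not reproduced here. As a point of reference for what "maximum flow" means, the following is a minimal sketch of a different, standard technique: the Edmonds-Karp augmenting-path algorithm. The network, node names, and capacities are made up for illustration.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: repeatedly augment along shortest paths found by BFS."""
    # Residual capacities, with reverse arcs initialised to 0.
    residual = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            residual.setdefault(v, {}).setdefault(u, 0)

    total = 0
    while True:
        # BFS for an augmenting path in the residual network.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left

        # Bottleneck capacity along the path found.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push flow: decrease forward arcs, increase reverse arcs.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical directed network: capacity[u][v] is the capacity of arc u -> v.
capacity = {
    "s": {"a": 3, "b": 2},
    "a": {"b": 1, "t": 2},
    "b": {"t": 3},
}
flow = max_flow(capacity, "s", "t")
print(flow)
```

The reverse arcs let a later augmentation undo part of an earlier one, which is why the result does not depend on the order in which paths are discovered.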

Discriminant analysis using empirical distribution function

  • Kim, Jae Young;Hong, Chong Sun
    • Journal of the Korean Data and Information Science Society / v.28 no.5 / pp.1179-1189 / 2017
  • In this study, we propose an alternative method for discriminant analysis that uses a multivariate empirical distribution function to express multivariate data as a simple one-dimensional statistic. The method reduces to estimating the optimal threshold based on classification accuracy measures and an empirical distribution function of the data composed of the classes. It can also be represented visually on a two-dimensional plane and discussed using measures from ROC curves, surfaces, and manifolds. To explore the usefulness of the method for discriminant analysis, we compared the proposed method with existing methods through simulations and illustrative examples. The proposed method may perform better in some cases.
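
One way to read the idea above is sketched below: each observation is mapped to the value of a pooled multivariate empirical distribution function, and a cut-off on that one-dimensional score is chosen to maximise accuracy. This is only an illustration under stated assumptions (pooled sample, accuracy as the measure, class 1 tending to larger values); the function names and data are hypothetical and the paper's actual procedure may differ.

```python
def ecdf_score(point, sample):
    """Multivariate empirical CDF: share of sample points <= point componentwise."""
    hits = sum(all(s <= p for s, p in zip(row, point)) for row in sample)
    return hits / len(sample)


def best_threshold(scores, labels):
    """Pick the cut-off on the 1-D scores that maximises accuracy.

    A point is predicted as class 1 when its score is >= the cut-off.
    """
    best_cut, best_acc = None, -1.0
    for cut in sorted(set(scores)):
        preds = [1 if s >= cut else 0 for s in scores]
        acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if acc > best_acc:
            best_cut, best_acc = cut, acc
    return best_cut, best_acc


# Toy two-dimensional data (made up for illustration).
class0 = [(1.0, 1.2), (1.5, 0.8), (2.0, 1.9), (0.7, 2.1)]
class1 = [(3.0, 3.1), (2.8, 3.6), (3.5, 2.9), (4.0, 4.2)]
data = class0 + class1
labels = [0] * len(class0) + [1] * len(class1)

scores = [ecdf_score(x, data) for x in data]
cut, accuracy = best_threshold(scores, labels)
print(cut, accuracy)
```

Other accuracy measures, such as those derived from an ROC curve, could replace the plain accuracy used in `best_threshold`.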

A Hilbert-Huang Transform Approach Combined with PCA for Predicting a Time Series

  • Park, Min-Jeong
    • The Korean Journal of Applied Statistics / v.24 no.6 / pp.995-1006 / 2011
  • A time series can be decomposed into simple components with a multiscale method. Empirical mode decomposition (EMD) is a multiscale method introduced by Huang et al. (1998). It is natural to apply a classical prediction method, such as a vector autoregressive (AR) model, to the obtained simple components instead of the original time series; in addition, a prediction procedure combining a classical prediction model with EMD and the Hilbert spectrum was proposed in Kim et al. (2008). In this paper, we suggest adding principal component analysis (PCA) to the prediction procedure, which enables efficient selection of input variables among the components obtained by EMD. We discuss the utility of adopting PCA in the prediction procedure based on EMD and the Hilbert spectrum, and analyze the daily worm account data with the proposed PCA-based prediction method.

Introduction to Gene Prediction Using HMM Algorithm

  • Kim, Keon-Kyun;Park, Eun-Sik
    • Journal of the Korean Data and Information Science Society / v.18 no.2 / pp.489-506 / 2007
  • Gene structure prediction, which predicts protein-coding regions in a given nucleotide sequence, is the most important process in annotating genes and greatly affects gene analysis and genome annotation. Because eukaryotic genes have more complicated structures in DNA sequences than prokaryotic genes, analysis programs for eukaryotic gene structure prediction use more diverse and more complicated computational models. Gene prediction methods for eukaryotic genes include ab initio, similarity-based, and ensemble methods, each of which uses various algorithms. This paper introduces how to predict genes using the hidden Markov model (HMM) algorithm and presents the process of gene prediction with well-known gene prediction programs.
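
To make the HMM idea concrete, the following is a minimal sketch of Viterbi decoding with two hidden states, noncoding (`N`) and coding (`C`). The states, probabilities, and sequence are toy assumptions chosen for illustration; real gene finders use far richer models (exons, introns, splice sites, codon structure).

```python
from math import log


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for an observed sequence."""
    # best[t][s]: log-probability of the best path ending in state s at time t.
    best = [{s: log(start_p[s]) + log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]


# Toy parameters (assumed): the coding state is GC-rich, noncoding is uniform.
states = ("N", "C")
start_p = {"N": 0.5, "C": 0.5}
trans_p = {"N": {"N": 0.9, "C": 0.1}, "C": {"N": 0.1, "C": 0.9}}
emit_p = {
    "N": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "C": {"A": 0.10, "C": 0.40, "G": 0.40, "T": 0.10},
}

seq = "ATATAT" + "GCGCGCGCGCGC" + "ATATAT"
path = viterbi(seq, states, start_p, trans_p, emit_p)
print(seq)
print("".join(path))
```

Working in log-probabilities avoids numerical underflow on long sequences, which matters for genome-scale input.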

On Finding the Maximum Capacity Flow in Networks

  • Lee, Chong-Hyung;Park, Dong-Ho;Lee, Seung-Min
    • Proceedings of the Korean Reliability Society Conference / 2002.06a / pp.297-302 / 2002
  • An efficient method is developed to obtain the maximum capacity flow for a network when its simple paths are known. Most existing techniques need either to convert simple paths into minimal cuts or to determine the order in which the simple paths are applied in order to reach the correct result. In this paper, we propose a method based on the concepts of signed simple path and signed flow defined in the text. Our method involves fewer arithmetic operations at each iteration and requires fewer iterations overall than the existing methods. It can be easily extended to a mixed network with a slight modification. Furthermore, its correctness does not depend on the order in which the simple paths are applied.

Cointegration Analysis with Mixed-Frequency Data of Quarterly GDP and Monthly Coincident Indicators

  • Seong, Byeongchan
    • The Korean Journal of Applied Statistics / v.25 no.6 / pp.925-932 / 2012
  • The article introduces a method for estimating a cointegrated vector autoregressive model from mixed-frequency data, using a state-space representation of the model's vector error correction model (VECM) form. The method directly estimates the parameters of the model, in the state-space form of its VECM representation, using the available data in its mixed-frequency form. It then allows one to compute in-sample smoothed estimates and out-of-sample forecasts at high-frequency intervals using the estimated model. The method is applied to a mixed-frequency data set that consists of the quarterly real gross domestic product and three monthly coincident indicators. The results show that the method produces accurate smoothed estimates and forecasts in comparison to a method based on single-frequency data.

A Penalized Spline Based Method for Detecting the DNA Copy Number Alteration in an Array-CGH Experiment

  • Kim, Byung-Soo;Kim, Sang-Cheol
    • The Korean Journal of Applied Statistics / v.22 no.1 / pp.115-127 / 2009
  • The purpose of statistical analyses of array-CGH experiment data is to divide the whole genome into regions of equal copy number, to quantify the copy number in each region, and finally to evaluate the significance of its difference from two. Several statistical procedures have been proposed, including circular binary segmentation and a Gaussian-based local regression for detecting break points (GLAD) by estimating a piecewise constant function. In this note, we propose a penalized spline regression and its simultaneous confidence band (SCB) approach to evaluate the statistical significance of regions of genetic gain or loss. A region in which the simultaneous confidence band stays above 0 or below 0 can be considered a region of genetic gain or loss, respectively. We compare the performance of the SCB procedure with the GLAD and hidden Markov model approaches through a simulation study in which the data were generated from AR(1) and AR(2) models, to reflect the spatial dependence of array-CGH data, in addition to the independence model. We found that the SCB method is more sensitive in detecting low-level copy number alterations.

A data-adaptive maximum penalized likelihood estimation for the generalized extreme value distribution

  • Lee, Youngsaeng;Shin, Yonggwan;Park, Jeong-Soo
    • Communications for Statistical Applications and Methods / v.24 no.5 / pp.493-505 / 2017
  • Maximum likelihood estimation (MLE) of the generalized extreme value distribution (GEVD) is known to sometimes overestimate a positive shape parameter when the sample size is small. Maximum penalized likelihood estimation (MPLE) with a Beta penalty function was proposed by some researchers to overcome this problem. However, determining the hyperparameters (HP) of the Beta penalty function remains an issue. This paper presents some data-adaptive methods to select the HP of the Beta penalty function in the MPLE framework. The idea is to let the data tell us what HP to use. For given data, the optimal HP is obtained from the minimum distance between the MLE and the MPLE. A bootstrap-based method is also proposed. These methods are compared with existing approaches. Performance evaluation experiments for the GEVD by Monte Carlo simulation show that the proposed methods work well in terms of bias and mean squared error. The methods are applied to Blackstone river data and Korean heavy rainfall data to show better performance than the MLE, the method of L-moments estimator, and existing MPLEs.

A Bayesian Prediction of the Generalized Pareto Model

  • Huh, Pan;Sohn, Joong Kweon
    • The Korean Journal of Applied Statistics / v.27 no.6 / pp.1069-1076 / 2014
  • Rainfall patterns have changed due to global warming, and sudden heavy rainfalls have become more frequent. Economic losses due to heavy rainfall have increased. We study the generalized Pareto distribution for modelling rainfall in Seoul based on data from 1973 to 2008. We use several priors, including Jeffreys' noninformative prior, and the Gibbs sampling method to derive Bayesian posterior predictive distributions. Based on the estimated posterior predictive distribution, the probability of heavy rainfall has increased over the last ten years.