• Title/Summary/Keyword: Data Compare


Small Sample Study of Kernel Hazard Ratio Estimator

  • Choi, Myong-Hui
    • Journal of the Korean Data and Information Science Society / v.5 no.2 / pp.59-74 / 1994
  • The hazard ratio may be useful as a descriptive measure for comparing the hazard experience of a treatment group with that of a control group. In this paper, we propose a kernel estimator of the hazard ratio for censored survival data. The uniform consistency and asymptotic normality of the proposed estimator are proved using a counting process approach. To assess the performance of the proposed estimator, we compare it with the Cox estimator and the generalized rank estimators of the hazard ratio in terms of MSE by Monte Carlo simulation.

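The abstract above does not give the form of the kernel estimator. As a rough illustration only, the sketch below takes the ratio of two kernel-smoothed Nelson-Aalen hazard estimates, one per group. The Epanechnikov kernel, the bandwidth, the helper names, and the simulated exponential data are all assumptions made for this example, not details taken from the paper.

```python
import random


def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * (1.0 - u * u) if abs(u) <= 1.0 else 0.0


def kernel_hazard(t, times, events, bandwidth):
    """Kernel-smoothed Nelson-Aalen hazard estimate at time t.

    times  : observed times (event or censoring, whichever came first)
    events : 1 if the observed time is an event, 0 if it is censored
    Assumes no tied observation times.
    """
    order = sorted(range(len(times)), key=lambda i: times[i])
    n = len(times)
    total = 0.0
    for rank, i in enumerate(order):
        at_risk = n - rank  # subjects still under observation at times[i]
        if events[i]:
            total += epanechnikov((t - times[i]) / bandwidth) / at_risk
    return total / bandwidth


def kernel_hazard_ratio(t, treated, control, bandwidth):
    """Ratio of the two smoothed hazards (treatment over control) at time t."""
    h_treated = kernel_hazard(t, treated[0], treated[1], bandwidth)
    h_control = kernel_hazard(t, control[0], control[1], bandwidth)
    return h_treated / h_control


def simulate(rate, censor_rate, n, rng):
    """Hypothetical data: exponential event times with exponential censoring."""
    times, events = [], []
    for _ in range(n):
        t_event = rng.expovariate(rate)
        t_censor = rng.expovariate(censor_rate)
        times.append(min(t_event, t_censor))
        events.append(1 if t_event <= t_censor else 0)
    return times, events


rng = random.Random(1)
treated = simulate(2.0, 0.5, 500, rng)  # true hazard 2
control = simulate(1.0, 0.5, 500, rng)  # true hazard 1
print(kernel_hazard_ratio(0.3, treated, control, bandwidth=0.2))
```

With constant true hazards of 2 and 1, the printed ratio should fall near 2. The evaluation point is kept at least one bandwidth away from zero to avoid boundary bias.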

A Comparison of Capabilities of Data Mining Tools

  • Choi, Youn-Seok;Kim, Jong-Geoun;Lee, Jong-Hee
    • Communications for Statistical Applications and Methods / v.8 no.2 / pp.531-541 / 2001
  • In this study, we objectively compare the capabilities of the latest versions of several data mining tools and provide information that enterprises and universities can use when choosing among them. In particular, we compare SAS/Enterprise Miner 3.0, SPSS/Clementine 5.2, and IBM/Intelligent Miner 6.1, which are well known and readily available.


Comparative Analysis of 3D Spatial Data Models (3차원 공간정보 데이터 모델 비교 분석)

  • Park, Se-Ho;Lee, Ji-Yeong
    • Spatial Information Research / v.17 no.3 / pp.277-285 / 2009
  • Each system needs a data model suited to its purpose in order to manage, analyze, and manipulate data efficiently. The data model determines the usable range of applications, and suitable data models are being developed for each application. In GIS, diverse spatial data models are being developed as well. The accuracy and currency of spatial data are important for efficient applications, and data modeling is important because it defines the structure of the spatial data. The purposes of this research are therefore to 1) compare domestic and overseas spatial data models with respect to their geometry models, topology models, and methods of visualizing 3D spatial data; 2) compare the features of the data models by analyzing each data structure; and 3) compare and analyze the features of each spatial data model through quantitative analysis.


Performance Comparison of Logistic Regression Algorithms on RHadoop

  • Jung, Byung Ho;Lim, Dong Hoon
    • Journal of the Korea Society of Computer and Information / v.22 no.4 / pp.9-16 / 2017
  • Machine learning has found widespread implementation and application in many domains of everyday life. Logistic regression is a classification method in machine learning that is widely used in many fields, including medicine, economics, marketing, and the social sciences. In this paper, we present MapReduce implementations of three existing algorithms for logistic regression, namely the Gradient Descent, Cost Minimization, and Newton-Raphson algorithms, on RHadoop, which integrates R with the Hadoop environment and is applicable to large-scale data. We compare the performance of these algorithms in estimating logistic regression coefficients on real and simulated data sets. We also compare the performance of the RHadoop and RHIPE platforms. The experiments showed that the Newton-Raphson algorithm performed better than the Gradient Descent and Cost Minimization algorithms on all data tested, and that RHadoop was better than RHIPE on real data, while the opposite held on simulated data.
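To make the comparison in this abstract concrete, the following single-machine sketch fits a one-predictor logistic regression by gradient descent and by Newton-Raphson and reports the iteration counts. It is not the paper's RHadoop code: the toy data, learning rate, tolerance, and function names are assumptions, and the Cost Minimization algorithm and the MapReduce layer are not shown. The gradient and Hessian are plain sums over observations, which is the kind of computation a MapReduce job could distribute.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gradient(beta, xs, ys):
    """Gradient of the average negative log-likelihood (intercept, slope)."""
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        residual = sigmoid(beta[0] + beta[1] * x) - y
        g0 += residual
        g1 += residual * x
    n = len(xs)
    return g0 / n, g1 / n


def hessian(beta, xs):
    """Entries (h00, h01, h11) of the symmetric 2x2 Hessian."""
    h00 = h01 = h11 = 0.0
    for x in xs:
        p = sigmoid(beta[0] + beta[1] * x)
        w = p * (1.0 - p)
        h00 += w
        h01 += w * x
        h11 += w * x * x
    n = len(xs)
    return h00 / n, h01 / n, h11 / n


def fit_gradient_descent(xs, ys, lr=0.5, tol=1e-8, max_iter=100000):
    beta = [0.0, 0.0]
    for it in range(1, max_iter + 1):
        g0, g1 = gradient(beta, xs, ys)
        beta = [beta[0] - lr * g0, beta[1] - lr * g1]
        if math.hypot(g0, g1) < tol:
            return beta, it
    return beta, max_iter


def fit_newton_raphson(xs, ys, tol=1e-8, max_iter=50):
    beta = [0.0, 0.0]
    for it in range(1, max_iter + 1):
        g0, g1 = gradient(beta, xs, ys)
        h00, h01, h11 = hessian(beta, xs)
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system H * step = g in closed form.
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        beta = [beta[0] - step0, beta[1] - step1]
        if math.hypot(g0, g1) < tol:
            return beta, it
    return beta, max_iter


# Hypothetical, non-separable toy data.
xs = [-2.0, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5]
ys = [0, 0, 1, 0, 0, 1, 0, 1, 1, 1]

beta_gd, iters_gd = fit_gradient_descent(xs, ys)
beta_nr, iters_nr = fit_newton_raphson(xs, ys)
print("gradient descent:", beta_gd, "iterations:", iters_gd)
print("Newton-Raphson:  ", beta_nr, "iterations:", iters_nr)
```

Both routines minimize the same objective, so they should agree on the coefficients. The difference lies in the number of passes over the data, which is a major cost when each pass is a distributed job.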

Comparison of Binary Discretization Algorithms for Data Mining

  • Na, Jong-Hwa;Kim, Jeong-Mi;Cho, Wan-Sup
    • Journal of the Korean Data and Information Science Society / v.16 no.4 / pp.769-780 / 2005
  • Discretization algorithms for continuous data have been actively studied in recent years, but few articles compare the efficiency of these algorithms. In this paper, we introduce the principles of several binary discretization algorithms, including C4.5, CART, and QUEST, and investigate their efficiency through a numerical study. For various underlying distributions, we compare these algorithms in terms of misclassification rate and MSE. Real data examples are also included.

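As a minimal sketch of what binary discretization of a continuous variable involves, the code below searches for the single cut point that minimizes the weighted impurity of the two resulting groups. The entropy criterion is in the spirit of C4.5 and the Gini criterion in the spirit of CART. The function names and the toy data are assumptions, the full algorithms (including QUEST) are not reproduced, and nothing here is taken from the paper itself.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_binary_split(values, labels, impurity):
    """Return (threshold, weighted child impurity) for the best cut point.

    Candidate thresholds are midpoints between consecutive distinct values.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point between equal values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * impurity(left) + len(right) * impurity(right)) / n
        if score < best_score:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
            best_score = score
    return best_threshold, best_score


# Hypothetical toy data: the classes separate cleanly between 3 and 4.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = ["a", "a", "a", "b", "b", "b"]

print(best_binary_split(values, labels, entropy))
print(best_binary_split(values, labels, gini))
```

On cleanly separated data both criteria pick the same cut. They can disagree on noisier data, which is where a comparison by misclassification rate becomes informative.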

Recent deep learning methods for tabular data

  • Yejin Hwang;Jongwoo Song
    • Communications for Statistical Applications and Methods / v.30 no.2 / pp.215-226 / 2023
  • Deep learning has made great strides on unstructured data such as text, images, and audio. For tabular data analysis, however, machine learning algorithms such as ensemble methods still outperform deep learning. To keep up with the performance of machine learning algorithms with good predictive power, several deep learning methods for tabular data have been proposed recently. In this paper, we review the latest deep learning models for tabular data and compare their performance on several datasets. We also compare the latest boosting methods with these deep learning methods and suggest guidelines for users who analyze tabular datasets. In regression, machine learning methods are better than deep learning methods, but for classification problems, deep learning methods perform better than machine learning methods in some cases.

Estimations in a skewed uniform distribution

  • Son, Hee-Ju;Woo, Jung-Soo
    • Journal of the Korean Data and Information Science Society / v.20 no.4 / pp.733-740 / 2009
  • We obtain a skewed uniform distribution from a uniform distribution and evaluate its coefficient of skewness. We then obtain the approximate maximum likelihood estimator (AML) and the moment estimator of the skew parameter in the skewed uniform distribution. We compare the simulated mean squared errors (MSE) of these estimators, and also compare the MSE of two proposed reliability estimators in two independent skewed uniform distributions with different skew parameters.


Estimation for the Half-Triangle Distribution Based on Progressively Type-II Censored Samples

  • Han, Jun-Tae;Kang, Suk-Bok
    • Journal of the Korean Data and Information Science Society / v.19 no.3 / pp.951-957 / 2008
  • We derive several approximate maximum likelihood estimators (AMLEs) and the maximum likelihood estimator (MLE) of the scale parameter in the half-triangle distribution based on progressively Type-II censored samples. We compare the proposed estimators in terms of mean squared error for various censored samples. Using the proposed estimators, we also obtain approximate maximum likelihood estimators of the reliability function and compare them in terms of mean squared error.


The Approximate MLE in a Skew-Symmetric Laplace Distribution

  • Son, Hee-Ju;Woo, Jung-Soo
    • Journal of the Korean Data and Information Science Society / v.18 no.2 / pp.573-584 / 2007
  • We define a skew-symmetric Laplace distribution from a symmetric Laplace distribution and evaluate its coefficient of skewness. We derive an approximate maximum likelihood estimator (AME) and a moment estimator (MME) of the skew parameter in the skew-symmetric Laplace distribution and compare the simulated mean squared errors of these estimators. We also compare the asymptotic mean squared errors of two estimators of reliability defined for two independent skew-symmetric distributions.


Bootstrap and Delete-d Jackknife Confidence Intervals for Parameters of an Exponential Distribution

  • Kang, Suk-Bok;Cho, Young-Suk
    • Journal of the Korean Data and Information Science Society / v.8 no.1 / pp.59-70 / 1997
  • We introduce several estimators of the location and scale parameters of the two-parameter exponential distribution and compare them by mean squared error (MSE). Using parametric bootstrap estimators and the delete-d jackknife, we obtain bootstrap and delete-d jackknife confidence intervals for the location and scale parameters, and compare the two kinds of intervals by length and coverage probability through Monte Carlo simulation.

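The parametric bootstrap interval mentioned in this abstract can be sketched roughly as follows for the scale parameter. The sketch assumes the maximum likelihood estimators (sample minimum for location, sample mean minus minimum for scale) and a simple percentile interval. The paper's own estimators, the delete-d jackknife, and the coverage study are not reproduced, and all function names and simulated data are hypothetical.

```python
import random


def fit_two_param_exponential(sample):
    """MLEs: location = sample minimum, scale = sample mean - minimum."""
    location = min(sample)
    scale = sum(sample) / len(sample) - location
    return location, scale


def percentile(sorted_values, q):
    """Simple nearest-rank percentile of an already sorted list."""
    index = min(len(sorted_values) - 1, max(0, int(q * len(sorted_values))))
    return sorted_values[index]


def parametric_bootstrap_ci(sample, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the scale parameter."""
    rng = random.Random(seed)
    location, scale = fit_two_param_exponential(sample)
    n = len(sample)
    boot_scales = []
    for _ in range(n_boot):
        # Resample from the fitted model rather than from the data.
        resample = [location + rng.expovariate(1.0 / scale) for _ in range(n)]
        boot_scales.append(fit_two_param_exponential(resample)[1])
    boot_scales.sort()
    alpha = 1.0 - level
    lower = percentile(boot_scales, alpha / 2)
    upper = percentile(boot_scales, 1.0 - alpha / 2)
    return lower, upper


# Hypothetical data: location 2, scale 3, sample size 30.
data_rng = random.Random(42)
sample = [2.0 + data_rng.expovariate(1.0 / 3.0) for _ in range(30)]

print("estimates:", fit_two_param_exponential(sample))
print("95% interval for scale:", parametric_bootstrap_ci(sample))
```

Repeating this over many simulated samples and recording how often the interval contains the true scale, along with its average length, would give the kind of coverage and length comparison the abstract describes.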