• Title/Abstract/Keywords: Data Compare


Small Sample Study of Kernel Hazard Ratio Estimator

  • Choi, Myong-Hui
    • Journal of the Korean Data and Information Science Society / Vol. 5, No. 2 / pp. 59-74 / 1994
  • The hazard ratio can serve as a descriptive measure for comparing the hazard experience of a treatment group with that of a control group. In this paper, we propose a kernel estimator of the hazard ratio for censored survival data. The uniform consistency and asymptotic normality of the proposed estimator are proved using a counting process approach. To assess the performance of the proposed estimator, we compare it with the Cox estimator and the generalized rank estimators of the hazard ratio in terms of MSE by Monte Carlo simulation.


A Comparison of Capabilities of Data Mining Tools

  • Choi, Youn-Seok;Kim, Jong-Geoun;Lee, Jong-Hee
    • Communications for Statistical Applications and Methods / Vol. 8, No. 2 / pp. 531-541 / 2001
  • In this study, we objectively compare the capabilities of the latest versions of several data mining tools and provide information that enterprises and universities can use when choosing among them. In particular, we compare SAS/Enterprise Miner 3.0, SPSS/Clementine 5.2, and IBM/Intelligent Miner 6.1, which are well known and easily obtained.


Comparative Analysis of 3D Spatial Data Models

  • 박세호;이지영
    • Spatial Information Research / Vol. 17, No. 3 / pp. 277-285 / 2009
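  • Managing, analyzing, and maintaining data efficiently requires a data model suited to the purpose of each system. The data model determines the scope of use of the corresponding application system, and data models tailored to individual application systems are being developed. In the GIS field as well, a variety of spatial data models have been developed for specific GIS applications, and spatial data models are built according to the services to be provided. For applications to be used efficiently, the accuracy and currency of spatial data matter, but the data modeling that defines the spatial data structure is especially important. This study therefore 1) compares domestic and international spatial data models item by item, covering the geometric model, the topological model, and the 3D visualization method each uses to represent spatial information, and 2) analyzes each data structure to compare the characteristics of the data models. Finally, 3) it selects spatial data models and uses quantitative analysis to examine the characteristics that follow from their data structures.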


Performance Comparison of Logistic Regression Algorithms on RHadoop

  • Jung, Byung Ho;Lim, Dong Hoon
    • Journal of the Korea Society of Computer and Information / Vol. 22, No. 4 / pp. 9-16 / 2017
  • Machine learning is widely implemented and applied across many domains. Logistic regression is a classification method in machine learning and is widely used in many fields, including medicine, economics, marketing, and the social sciences. In this paper, we present MapReduce implementations of three existing algorithms for logistic regression, namely the Gradient Descent, Cost Minimization, and Newton-Raphson algorithms, on RHadoop, which integrates R with the Hadoop environment and is applicable to large-scale data. We compare the performance of these algorithms in estimating logistic regression coefficients on real and simulated data sets. We also compare the performance of the RHadoop and RHIPE platforms. The experiments showed that the Newton-Raphson algorithm performed better than the Gradient Descent and Cost Minimization algorithms on all data tested, and that RHadoop outperformed RHIPE on real data, while the opposite held on simulated data.
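
As a rough single-machine illustration of two of the algorithms named in this abstract, the following sketch fits a logistic regression with one predictor by gradient descent and by Newton-Raphson and reports how many iterations each needs. It is not the paper's MapReduce or RHadoop implementation, and it does not cover the Cost Minimization algorithm; the toy data, learning rate, tolerance, and all function names are assumptions made for illustration.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def gradient(beta, xs, ys):
    # Gradient of the average negative log-likelihood (intercept, slope).
    n = len(xs)
    g0 = g1 = 0.0
    for x, y in zip(xs, ys):
        r = sigmoid(beta[0] + beta[1] * x) - y
        g0 += r
        g1 += r * x
    return g0 / n, g1 / n

def fit_gradient_descent(xs, ys, lr=0.1, tol=1e-8, max_iter=200000):
    # Fixed-step gradient descent; lr is an illustrative choice.
    beta = [0.0, 0.0]
    for iteration in range(1, max_iter + 1):
        g0, g1 = gradient(beta, xs, ys)
        if max(abs(g0), abs(g1)) < tol:
            return beta, iteration
        beta[0] -= lr * g0
        beta[1] -= lr * g1
    return beta, max_iter

def fit_newton_raphson(xs, ys, tol=1e-8, max_iter=50):
    # Newton-Raphson: solve the 2x2 system H d = g at every step.
    beta = [0.0, 0.0]
    n = len(xs)
    for iteration in range(1, max_iter + 1):
        g0, g1 = gradient(beta, xs, ys)
        if max(abs(g0), abs(g1)) < tol:
            return beta, iteration
        h00 = h01 = h11 = 0.0
        for x in xs:
            p = sigmoid(beta[0] + beta[1] * x)
            w = p * (1.0 - p)
            h00 += w / n
            h01 += w * x / n
            h11 += w * x * x / n
        det = h00 * h11 - h01 * h01
        beta[0] -= (h11 * g0 - h01 * g1) / det
        beta[1] -= (h00 * g1 - h01 * g0) / det
    return beta, max_iter

# Hypothetical toy data (not linearly separable, so the MLE exists).
xs = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
ys = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

beta_gd, iters_gd = fit_gradient_descent(xs, ys)
beta_nr, iters_nr = fit_newton_raphson(xs, ys)
print("gradient descent:", beta_gd, "iterations:", iters_gd)
print("Newton-Raphson:  ", beta_nr, "iterations:", iters_nr)
```

Both routines minimize the same objective, so they should agree on the coefficients; the difference that a comparison of this kind measures is the cost of getting there, since Newton-Raphson uses curvature information and typically needs far fewer iterations than a fixed-step gradient method.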

Comparison of Binary Discretization Algorithms for Data Mining

  • Na, Jong-Hwa;Kim, Jeong-Mi;Cho, Wan-Sup
    • Journal of the Korean Data and Information Science Society / Vol. 16, No. 4 / pp. 769-780 / 2005
  • Discretization algorithms for continuous data have recently been studied actively, but few articles compare the efficiency of these algorithms. In this paper, we introduce the principles of several binary discretization algorithms, including C4.5, CART, and QUEST, and investigate their efficiency through a numerical study. For various underlying distributions, we compare these algorithms in terms of misclassification rate and MSE. Real data examples are also included.
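
To make the idea of binary discretization concrete, here is a minimal sketch that picks a single cut point for one continuous variable by minimizing the weighted Gini impurity of the two resulting groups, which is the splitting criterion associated with CART. It is not taken from the paper: C4.5 and QUEST choose their cut points by different criteria, and the function names and sample data are hypothetical.

```python
def gini(labels):
    # Gini impurity of a list of class labels.
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def best_binary_split(values, labels):
    # Try the midpoint between each pair of adjacent distinct values
    # and keep the cut with the lowest weighted impurity.
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_cut, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no cut point between equal values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_score = score
            best_cut = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_cut, best_score

# Hypothetical data: the two classes are cleanly separated around 4.
values = [1.0, 1.5, 2.0, 2.5, 6.0, 6.5, 7.0, 8.0]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]

cut, score = best_binary_split(values, labels)
print("cut point:", cut, "weighted Gini:", score)
```

Values at or below the returned cut form one interval and the rest form the other; a misclassification rate can then be computed by assigning each interval its majority class.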


Recent deep learning methods for tabular data

  • Yejin Hwang;Jongwoo Song
    • Communications for Statistical Applications and Methods / Vol. 30, No. 2 / pp. 215-226 / 2023
  • Deep learning has made great strides with unstructured data such as text, images, and audio. For tabular data analysis, however, machine learning algorithms such as ensemble methods still outperform deep learning. To match the predictive power of these machine learning algorithms, several deep learning methods for tabular data have been proposed recently. In this paper, we review the latest deep learning models for tabular data and compare their performance on several datasets. We also compare the latest boosting methods with these deep learning methods and suggest guidelines for users who analyze tabular datasets. In regression, machine learning methods are better than deep learning methods, but in classification problems, deep learning methods perform better than machine learning methods in some cases.

Estimations in a skewed uniform distribution

  • Son, Hee-Ju;Woo, Jung-Soo
    • Journal of the Korean Data and Information Science Society / Vol. 20, No. 4 / pp. 733-740 / 2009
  • We obtain a skewed uniform distribution from a uniform distribution and evaluate its coefficient of skewness. We then obtain the approximate maximum likelihood estimator (AML) and the moment estimator of the skew parameter in the skewed uniform distribution. We compare the simulated mean squared errors (MSE) of these estimators, and also compare the MSE of two proposed reliability estimators for two independent skewed uniform distributions with different skew parameters.


Estimation for the Half-Triangle Distribution Based on Progressively Type-II Censored Samples

  • Han, Jun-Tae;Kang, Suk-Bok
    • Journal of the Korean Data and Information Science Society / Vol. 19, No. 3 / pp. 951-957 / 2008
  • We derive some approximate maximum likelihood estimators (AMLEs) and the maximum likelihood estimator (MLE) of the scale parameter in the half-triangle distribution based on progressively Type-II censored samples. We compare the proposed estimators in terms of mean squared error for various censored samples. We also obtain approximate maximum likelihood estimators of the reliability function using the proposed estimators and compare these in terms of mean squared error.


The Approximate MLE in a Skew-Symmetric Laplace Distribution

  • Son, Hee-Ju;Woo, Jung-Soo
    • Journal of the Korean Data and Information Science Society / Vol. 18, No. 2 / pp. 573-584 / 2007
  • We define a skew-symmetric Laplace distribution from a symmetric Laplace distribution and evaluate its coefficient of skewness. We derive an approximate maximum likelihood estimator (AME) and a moment estimator (MME) of the skew parameter in the skew-symmetric Laplace distribution and compare the simulated mean squared errors of these estimators. We also compare the asymptotic mean squared errors of two estimators of reliability defined for two independent skew-symmetric distributions.


Bootstrap and Delete-d Jackknife Confidence Intervals for Parameters of an Exponential Distribution

  • Kang, Suk-Bok;Cho, Young-Suk
    • Journal of the Korean Data and Information Science Society / Vol. 8, No. 1 / pp. 59-70 / 1997
  • We introduce several estimators of the location and scale parameters of the two-parameter exponential distribution and compare them by mean squared error (MSE). Using parametric bootstrap estimators and the delete-d jackknife, we obtain bootstrap and delete-d jackknife confidence intervals for the location and scale parameters, and compare the two kinds of interval by length and coverage probability using the Monte Carlo method.
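
The parametric bootstrap interval mentioned in this abstract can be sketched as follows for the scale parameter. This is only an illustration under stated assumptions: it uses the maximum likelihood estimators (sample minimum for location, sample mean minus minimum for scale) rather than the estimators studied in the paper, it builds a simple percentile interval, and it does not cover the delete-d jackknife. The function names, sample size, seed, and true parameter values are hypothetical.

```python
import random

def mle_two_param_exponential(sample):
    # MLEs for the two-parameter exponential distribution:
    # location = sample minimum, scale = sample mean - minimum.
    loc = min(sample)
    scale = sum(sample) / len(sample) - loc
    return loc, scale

def parametric_bootstrap_ci(sample, n_boot=2000, alpha=0.05, seed=0):
    # Percentile interval for the scale parameter: simulate from the
    # fitted model, re-estimate the scale, and take empirical quantiles.
    rng = random.Random(seed)
    n = len(sample)
    loc_hat, scale_hat = mle_two_param_exponential(sample)
    boot_scales = []
    for _ in range(n_boot):
        resample = [loc_hat + rng.expovariate(1.0 / scale_hat)
                    for _ in range(n)]
        boot_scales.append(mle_two_param_exponential(resample)[1])
    boot_scales.sort()
    lower = boot_scales[int((alpha / 2) * n_boot)]
    upper = boot_scales[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical sample: location 1.0, scale 2.0, n = 30.
data_rng = random.Random(42)
sample = [1.0 + data_rng.expovariate(1.0 / 2.0) for _ in range(30)]

loc_hat, scale_hat = mle_two_param_exponential(sample)
lower, upper = parametric_bootstrap_ci(sample)
print("scale estimate:", scale_hat)
print("95% bootstrap interval:", (lower, upper), "length:", upper - lower)
```

Repeating this over many simulated samples and recording how often the interval contains the true scale, along with the average interval length, would give the kind of coverage-and-length comparison the abstract describes.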
