• Title/Summary/Keyword: statistics techniques

Comparative Study on Statistical Packages for using Multivariate Q-technique

  • Choi, Yong-Seok;Moon, Hee-jung
    • Communications for Statistical Applications and Methods / v.10 no.2 / pp.433-443 / 2003
  • This study compares the multivariate Q-techniques available in the current versions of SAS, SPSS, Minitab, and S-plus, packages well known to students of statistics. Data are analyzed through direct command input in SAS and through menus in SPSS, Minitab, and S-plus. In each package, the analysis method is chosen according to how frequently it is used. We compare the Q-techniques broadly in terms of input data, input options, statistical charts, and statistical output.

A comparative study of the Gini coefficient estimators based on the regression approach

  • Mirzaei, Shahryar;Borzadaran, Gholam Reza Mohtashami;Amini, Mohammad;Jabbari, Hadi
    • Communications for Statistical Applications and Methods / v.24 no.4 / pp.339-351 / 2017
  • Resampling approaches were the first techniques employed to compute a variance for the Gini coefficient; however, many authors have shown that the Gini coefficient and its corresponding variance can be obtained from a regression model. Despite the simplicity of the regression approach for computing a standard error of the Gini coefficient, the use of the proposed regression model has been challenging in economics. Therefore, in this paper, we focus on a comparative study of the regression approach and resampling techniques. The regression method is shown to overestimate the standard error of the Gini index. The simulations show that the Gini estimator based on the modified regression model is also consistent and asymptotically normal, with less divergence from the normal distribution than the resampling techniques.
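
The contrast between a regression-based and a resampling-based standard error can be sketched as follows. The abstract does not state the regression model, so this sketch assumes one common formulation in which the rank `i` of each sorted income is regressed on a constant with weights proportional to income; the function names, the number of bootstrap replications, and the synthetic sample are illustrative assumptions, not the authors' code.

```python
import math
import random


def weighted_mean_rank(sorted_incomes):
    # Weighted least squares estimate of theta in the model i = theta + v_i,
    # with weights proportional to income (assumed formulation).
    total = sum(sorted_incomes)
    return sum(i * y for i, y in enumerate(sorted_incomes, start=1)) / total


def gini(incomes):
    # Gini coefficient written in terms of the weighted mean rank theta.
    y = sorted(incomes)
    n = len(y)
    return 2.0 * weighted_mean_rank(y) / n - (n + 1.0) / n


def regression_se(incomes):
    # Standard error implied by ordinary least squares on the transformed
    # model i*sqrt(y_i) = theta*sqrt(y_i) + u_i.
    y = sorted(incomes)
    n = len(y)
    theta = weighted_mean_rank(y)
    rss = sum(v * (i - theta) ** 2 for i, v in enumerate(y, start=1))
    var_theta = rss / (n - 1) / sum(y)
    return 2.0 * math.sqrt(var_theta) / n


def bootstrap_se(incomes, replications=500, seed=0):
    # Nonparametric bootstrap: resample with replacement and recompute Gini.
    rng = random.Random(seed)
    n = len(incomes)
    estimates = [gini(rng.choices(incomes, k=n)) for _ in range(replications)]
    mean = sum(estimates) / replications
    return math.sqrt(sum((g - mean) ** 2 for g in estimates) / (replications - 1))


if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.lognormvariate(0.0, 1.0) for _ in range(200)]  # synthetic incomes
    print("Gini:", gini(sample))
    print("regression SE:", regression_se(sample))
    print("bootstrap SE:", bootstrap_se(sample))
```

Running both estimators on the same sample gives a quick way to see how far the two standard errors differ for a given income distribution.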

Application of EDA Techniques for Estimating Rainfall Quantiles (확률강우량 산정을 위한 EDA 기법의 적용)

  • Park, Hyunkeun;Oh, Sejeong;Yoo, Chulsang
    • KSCE Journal of Civil and Environmental Engineering Research / v.29 no.4B / pp.319-328 / 2009
  • This study quantified the data by applying EDA techniques that take the data structure into account, and the results were then used for frequency analysis. Traditional methods based on the method of moments yield statistics that are very sensitive to extreme values, whereas EDA techniques yield very stable statistics with small variation. To apply EDA techniques to frequency analysis, a normalizing transformation and its inverse are needed in order to conserve the skewness of the raw data. That is, the raw data are transformed to follow the normal distribution, the statistics are estimated with the EDA techniques, and the statistics of the transformed data are finally inverse-transformed. The resulting statistics are then used for frequency analysis with a given probability density function. This study analyzed annual maximum one-hour rainfall data at the Seoul and Pohang stations. The EDA techniques were found to give more stable rainfall quantiles that were also less sensitive to extreme values. The methodology may be especially useful for frequency analysis of rainfall at stations with high annual variation in rainfall due to, for example, climate change.
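
The transform, estimate, inverse-transform sequence described above can be illustrated with a minimal sketch. The abstract names neither the transformation nor the specific EDA statistics, so this example assumes a log transform, the median as location, and the interquartile range divided by 1.349 as a robust scale; it also back-transforms the final quantile directly, which may differ from the paper's procedure. The rainfall depths are made-up illustrative values.

```python
import math
from statistics import NormalDist, mean, median, quantiles, stdev


def rainfall_quantile(depths, return_period, robust=True):
    """Estimate the rainfall depth for a given return period (in years)."""
    # Step 1: normalizing transform (a log transform is assumed here).
    logs = [math.log(d) for d in depths]

    # Step 2: location and scale in the transformed space.
    if robust:
        # Resistant statistics: median and IQR scaled to a normal sigma.
        q1, _, q3 = quantiles(logs, n=4, method="inclusive")
        location = median(logs)
        scale = (q3 - q1) / 1.349
    else:
        # Conventional moment-based statistics for comparison.
        location = mean(logs)
        scale = stdev(logs)

    # Step 3: normal quantile for the non-exceedance probability 1 - 1/T,
    # then the inverse transform back to rainfall depth.
    z = NormalDist().inv_cdf(1.0 - 1.0 / return_period)
    return math.exp(location + z * scale)


if __name__ == "__main__":
    annual_maxima = [30, 35, 40, 42, 45, 48, 50, 55, 60, 65]  # hypothetical, mm
    with_extreme = annual_maxima[:-1] + [200]
    for label, data in (("base", annual_maxima), ("with extreme", with_extreme)):
        print(label,
              "robust:", rainfall_quantile(data, 100),
              "moments:", rainfall_quantile(data, 100, robust=False))
```

Replacing the largest observation with an extreme value leaves the median and quartiles of this sample unchanged, so the robust estimate stays put while the moment-based estimate shifts.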

Interpretation of Real Information-missing Patch of Remote Sensing Image with Kriging Interpolation of Spatial Statistics

  • Yiming, Feng;Xiangdong, Lei;Yuanchang, Lu
    • Proceedings of the KSRS Conference / 2003.11a / pp.1479-1481 / 2003
  • This paper aims to interpret patches of missing information in a real remote sensing image by using kriging interpolation from spatial statistics. A TM image of the Jingouling Forest Farm of the Wangqing Forestry Bureau in Northeast China, acquired on 1 July 1997, was used as the test material. Based on a classification of the TM image, the patches of missing pixels were interpolated by kriging using the image processing software ERDAS and the geographic information system software Arc/Info. The interpolation results passed an accuracy check. The paper thus provides a method for interpreting information-missing patches of an image.
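
As a rough illustration of how kriging fills a missing pixel from its known neighbours, the sketch below implements ordinary kriging in plain Python. It is not the ERDAS or Arc/Info workflow used in the paper; the exponential variogram, its `sill` and `range_` parameters, and the sample pixel values are all assumptions made for the example.

```python
import math


def variogram(h, sill=1.0, range_=3.0):
    # Exponential variogram model with no nugget (assumed for illustration).
    return sill * (1.0 - math.exp(-h / range_))


def solve(a, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ordinary_kriging(known, target):
    """Predict the value at `target` from `known` = [((x, y), value), ...]."""
    n = len(known)
    points = [p for p, _ in known]
    # Kriging system: variogram between known points, plus a Lagrange
    # multiplier row and column that force the weights to sum to one.
    a = [[variogram(math.dist(points[i], points[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [variogram(math.dist(p, target)) for p in points] + [1.0]
    weights = solve(a, b)[:n]
    return sum(w * value for w, (_, value) in zip(weights, known))


if __name__ == "__main__":
    # Hypothetical pixel values around a missing pixel at (0.5, 0.5).
    pixels = [((0.0, 0.0), 1.0), ((1.0, 0.0), 2.0),
              ((0.0, 1.0), 3.0), ((1.0, 1.0), 4.0)]
    print(ordinary_kriging(pixels, (0.5, 0.5)))
```

Because the weights are constrained to sum to one and the variogram has no nugget, the predictor reproduces known pixel values exactly and returns a constant for a constant field.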

Periodization in the History of Statistics

  • Jo, Jae-Keun
    • Communications for Statistical Applications and Methods / v.11 no.1 / pp.31-47 / 2004
  • The history of statistics from the mid-seventeenth to the early twentieth century is considered and a scheme of periodization is proposed. In the first period (1650-1750), named 'the age of probability' in this paper, the concept of probability emerged. In the second period (1750-1820), named 'the age of error theory', statistical techniques such as the least squares method were developed by astronomers and geodesists, and these techniques were supported theoretically by mathematicians such as Laplace and Gauss. The third period (1820-1880) is called 'the age of statistics (as a plural noun)' because statistical data played prominent roles in social sciences such as sociology and psychology. Finally, in the last period (1880- ), called 'the age of statistics (as a singular noun)', the discipline of statistics came to maturity in both theory and application.

Environmental Survey Data Modeling Using K-means Clustering Techniques

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Journal of the Korean Data and Information Science Society / v.16 no.3 / pp.557-566 / 2005
  • Clustering is the process of grouping data into clusters so that objects within a cluster are highly similar to one another. Among several clustering techniques, this paper uses k-means clustering, which is classified as a partitional clustering method. We analyze the 2002 Gyeongnam social indicator survey data using k-means clustering to extract environmental information. The resulting clusters can be used for environmental preservation and environmental improvement.
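
For readers unfamiliar with the method, a minimal sketch of k-means (Lloyd's algorithm) in plain Python follows. The random initialization, the iteration cap, and the toy two-dimensional points are assumptions for illustration; the survey data and settings used in the paper are not given in the abstract.

```python
import math
import random


def kmeans(points, k, iterations=100, seed=0):
    """Partition `points` (tuples of floats) into k clusters."""
    rng = random.Random(seed)
    # Initialize centroids with k distinct observations chosen at random.
    centroids = rng.sample(points, k)
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the algorithm converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


if __name__ == "__main__":
    # Hypothetical responses scored on two survey items.
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    labels, centroids = kmeans(data, k=2)
    print(labels)
    print(centroids)
```

Being a partitional method, k-means assigns every observation to exactly one cluster, and the number of clusters `k` must be chosen in advance.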

Environmental Survey Data Modeling using K-means Clustering Techniques

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Proceedings of the Korean Data and Information Science Society Conference / 2004.10a / pp.77-86 / 2004
  • Clustering is the process of grouping data into clusters so that objects within a cluster are highly similar to one another. Among several clustering techniques, this paper uses k-means clustering, which is classified as a partitional clustering method. We analyze the 2002 Gyeongnam social indicator survey data using k-means clustering to extract environmental information. The resulting clusters can be used for environmental preservation and environmental improvement.

On the Performance of Cuckoo Search and Bat Algorithms Based Instance Selection Techniques for SVM Speed Optimization with Application to e-Fraud Detection

  • AKINYELU, Andronicus Ayobami;ADEWUMI, Aderemi Oluyinka
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.3 / pp.1348-1375 / 2018
  • Support Vector Machine (SVM) is a well-known machine learning classification algorithm that has been widely applied to many data mining problems with good accuracy. However, SVM classification speed decreases as dataset size increases. Some applications, such as video surveillance and intrusion detection, require a classifier to be trained very quickly and on large datasets. Hence, this paper introduces two filter-based instance selection techniques for optimizing SVM training speed. Fast classification is often achieved at the expense of classification accuracy, and some applications, such as phishing and spam email classifiers, are very sensitive to a slight drop in classification accuracy. Hence, this paper also introduces two wrapper-based instance selection techniques for improving SVM predictive accuracy and training speed. The wrapper-based and filter-based techniques are inspired by the Cuckoo Search Algorithm and the Bat Algorithm. The proposed techniques are validated on three popular e-fraud types: credit card fraud, spam email, and phishing email. In addition, the proposed techniques are validated on 20 other datasets provided by the UCI data repository. Moreover, statistical analysis is performed, and the experimental results reveal that the filter-based and wrapper-based techniques significantly improved SVM classification speed. The results also reveal that the wrapper-based techniques improved SVM predictive accuracy in most cases.
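
The general idea of filter-based instance selection can be sketched without the metaheuristics. The example below is not the Cuckoo Search or Bat Algorithm method proposed in the paper; it substitutes a much simpler nearest-enemy heuristic, keeping only the training instances closest to the opposite class, on the grounds that an SVM decision boundary is determined by points near the class border. The function name, the `keep` parameter, and the toy data are hypothetical.

```python
import math


def select_boundary_instances(samples, labels, keep):
    """Return indices of the `keep` instances nearest to an opposite-class point."""
    scored = []
    for i, (x, y) in enumerate(zip(samples, labels)):
        # Distance from this instance to its nearest "enemy" (other class).
        nearest_enemy = min(
            math.dist(x, other)
            for other, other_label in zip(samples, labels)
            if other_label != y
        )
        scored.append((nearest_enemy, i))
    # Filter step: keep the instances with the smallest enemy distance.
    chosen = [i for _, i in sorted(scored)[:keep]]
    return sorted(chosen)


if __name__ == "__main__":
    # Toy one-dimensional data: class 0 on the left, class 1 on the right.
    samples = [(0.0,), (1.0,), (4.0,), (6.0,), (9.0,), (10.0,)]
    labels = [0, 0, 0, 1, 1, 1]
    subset = select_boundary_instances(samples, labels, keep=2)
    print(subset)  # indices of the two instances facing each other across the gap
```

A classifier would then be trained only on the selected subset, trading a smaller training set for the risk of discarding informative points, which is the speed versus accuracy tension the abstract describes.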

Malware classification using statistical techniques (통계적 기법을 이용한 악성 소프트웨어 분류)

  • Won, Sungmin;Kim, Hyunjoo;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.30 no.6 / pp.851-865 / 2017
  • Ransomware such as WannaCry is a global issue, and methods to defend against malware attacks are important. Malware types must be classified efficiently in order to minimize the damage they cause. This study builds models that classify malware using various statistical techniques. Several classification techniques, such as logistic regression, random forest, gradient boosting, and support vector machine, are used to construct the models. The study also helps identify the key variables for classifying the type of malicious software.

High Speed I/O Processing for Shared Memory Multiprocessor Systems (공유 메모리 다중 프로세서 시스템에서 고속 입출력 처리 기법)

  • 윤용호;임인칠
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.2 / pp.19-32 / 1993
  • This paper proposes new high-speed input/output techniques for a shared memory multiprocessor system. It addresses a high-speed I/O processor that can connect different kinds of large I/O peripheral devices, the communication protocol with the main processing units for I/O operations, and the job scheduling scheme. The paper also introduces a disk cache technique that compensates for I/O devices that are slow compared with the main processing units. These techniques were implemented in the TICOM system. Performance evaluation statistics were collected and analyzed for the proposed high-speed I/O processing techniques, and they show the superiority of the proposed techniques.
