• Title/Summary/Keyword: Mahalanobis

Detecting outliers in multivariate data and visualization-R scripts (다변량 자료에서 특이점 검출 및 시각화 - R 스크립트)

  • Kim, Sung-Soo
    • The Korean Journal of Applied Statistics / v.31 no.4 / pp.517-528 / 2018
  • We provide R scripts for detecting and visualizing outliers in multivariate data. Outliers are detected using three approaches: 1) robust Mahalanobis distance, 2) methods for high-dimensional data, and 3) density-based methods. Detected potential outliers are visualized using 1) multidimensional scaling (MDS) and a minimal spanning tree (MST) with k-means clustering, 2) MDS with fviz_cluster, and 3) principal component analysis (PCA) with fviz_cluster. As real data, we use MLB pitching data for 2013 and 2014, which include Ryu, Hyun-jin. The R scripts and an R package can be downloaded at "http://www.knou.ac.kr/~sskim/ddpoutlier.html".

Illumination estimation based on valid pixel selection from CCD camera response (CCD카메라 응답으로부터 유효 화소 선택에 기반한 광원 추정)

  • Kwon, Oh-Seol;Cho, Yang-Ho;Kim, Yun-Tae;Song, Kun-Ho;Ha, Yeong-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP / v.41 no.5 / pp.251-258 / 2004
  • This paper proposes a method for estimating the illuminant chromaticity from the distribution of the responses of a CCD camera in a real-world scene. Illuminant estimation using a highlight method is based on the geometric relation between body reflection and surface reflection. In general, the pixels in a highlight region are affected by illuminant geometric differences, camera quantization errors, and the non-uniformity of the CCD sensor. As a result, estimating the illuminant from the pixels of a CCD camera without any preprocessing leads to inaccurate results. To solve this problem, the proposed method analyzes the distribution of the CCD camera responses and selects pixels in highlight regions using the Mahalanobis distance. Because the Mahalanobis distance is computed from the camera responses, valid pixels can be selected adaptively from among the pixels distributed in the highlight regions. Lines are then fitted to the selected pixels in r-g chromaticity coordinates using principal component analysis (PCA), and the illuminant chromaticity is estimated from the intersection points of the lines. Experimental results show that the proposed method reduces the estimation error compared with the conventional method.
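
The last two steps of the abstract above (fitting a line to the selected pixels in r-g chromaticity space with PCA, then intersecting the lines) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names and sample values are hypothetical, the Mahalanobis-based pixel selection is assumed to have been done already, and only two lines are intersected.

```python
import math


def rg_chromaticity(rgb):
    """Map an (R, G, B) camera response to r-g chromaticity coordinates."""
    r, g, b = rgb
    total = r + g + b
    return (r / total, g / total)


def principal_line(points):
    """Fit a line to 2-D points with PCA.

    Returns (mean point, unit direction of the first principal axis).
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Orientation of the major axis of a 2x2 covariance matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (mx, my), (math.cos(theta), math.sin(theta))


def intersect(line_a, line_b):
    """Intersect two lines given as (point, direction); None if parallel."""
    (x1, y1), (dx1, dy1) = line_a
    (x2, y2), (dx2, dy2) = line_b
    cross = dx1 * dy2 - dy1 * dx2
    if abs(cross) < 1e-12:
        return None
    t = ((x2 - x1) * dy2 - (y2 - y1) * dx2) / cross
    return (x1 + t * dx1, y1 + t * dy1)


# Hypothetical chromaticities from two highlight regions.
region_1 = [(0.1, 0.1), (0.2, 0.2), (0.3, 0.3)]
region_2 = [(0.5, 0.3), (0.6, 0.2), (0.7, 0.1)]
estimate = intersect(principal_line(region_1), principal_line(region_2))
```

With more than two highlight regions, each pair of lines gives one intersection point, and some rule for combining those points (not specified in the abstract) would be needed.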

Multivariate Analysis on Quantitative Characteristics of Prunus mume (매실의 다변량에 의한 양적 형질 분석)

  • Choi, Gab Lim;Hyun, Kyu-Hwan;Shin, Dong Young
    • Korean Journal of Plant Resources / v.27 no.1 / pp.89-94 / 2014
  • Varietal distances were measured by Mahalanobis's $D^2$ statistic in the 190 possible pairwise comparisons among twenty varieties of Prunus mume, using twelve characters such as seed weight, length, width, and diameter; fruit weight; number of sepals, petals, pistils, and stigmas; and leaf length and width. A complete linkage cluster analysis based on the Mahalanobis distance ($D^2$) was performed. The twenty varieties were broadly classified into five groups: Groups I, II, III, IV, and V included two, four, five, five, and four varieties, respectively. Most of the varietal groups were not associated with geographical origin. Among the twelve characters, number of stigmas, leaf length, and leaf width were the largest contributors to $D^2$ both within and between groups.
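
For reference, the quantities named in this abstract can be written out as below. The notation is an assumption and is not taken from the paper: $\bar{\mathbf{x}}_i$ denotes the mean vector of the twelve characters for variety $i$, $\mathbf{S}$ the pooled within-variety covariance matrix in the usual formulation of $D^2$, and $A$, $B$ two clusters of varieties.

```latex
% Number of pairwise comparisons among 20 varieties
\binom{20}{2} = \frac{20 \cdot 19}{2} = 190

% Mahalanobis D^2 between varieties i and j
D^2_{ij} = (\bar{\mathbf{x}}_i - \bar{\mathbf{x}}_j)^{\top}
           \mathbf{S}^{-1}
           (\bar{\mathbf{x}}_i - \bar{\mathbf{x}}_j)

% Complete linkage: distance between clusters A and B
d(A, B) = \max_{i \in A,\; j \in B} D^2_{ij}
```

Under complete linkage, two clusters are merged only when even their most dissimilar pair of varieties is close, which tends to produce compact groups.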

VoIP-Based Voice Secure Telecommunication Using Speaker Authentication in Telematics Environments (텔레매틱스 환경에서 화자인증을 이용한 VoIP기반 음성 보안통신)

  • Kim, Hyoung-Gook;Shin, Dong
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.10 no.1 / pp.84-90 / 2011
  • In this paper, we propose a VoIP-based secure voice telecommunication technology that uses text-independent speaker authentication in telematics environments. For secure telecommunication, the sender's voice packets are encrypted with a public key generated from the speaker's voice information and transmitted to the receiver. The scheme is designed to resist man-in-the-middle attacks. At the receiver side, voice features extracted from the received voice packets are compared with the reference voice key received from the sender side for speaker authentication. To improve the accuracy of text-independent speaker authentication, Gaussian Mixture Model (GMM) supervectors are applied to a Support Vector Machine (SVM) kernel using the Bayesian information criterion (BIC) and the Mahalanobis distance (MD).

Sound Quality Evaluation and Grade Construction of the Level D Noise for the Vehicle Using MTS (MTS기법을 이용한 차량 D단 소음의 음질 평가 및 음질 등급화 구축)

  • Park, Sang-Gil;Park, Won-Sik;Sim, Hyoun-Jin;Lee, Jung-Youn;Oh, Jae-Eung
    • Transactions of the Korean Society for Noise and Vibration Engineering / v.18 no.4 / pp.393-399 / 2008
  • Reducing vehicle interior noise has been a main interest of NVH engineers. The driver's perception of vehicle noise is affected largely by the psychoacoustic characteristics of the noise as well as by the sound pressure level (SPL). Previous methods for evaluating the sound quality (SQ) of vehicle interior noise are linear regression analysis of subjective SQ metrics and estimation of subjective SQ values by a neural network. However, these methods depend so heavily on jury tests that they lead to many difficulties. To reduce the reliance on jury tests, we suggest a new SQ evaluation method using the Mahalanobis distance. The optimal characteristic values that influence the SQ evaluation result were derived using the signal-to-noise ratio (SN ratio) of the Taguchi method. Finally, a new SQ evaluation method is constructed using the Mahalanobis-Taguchi system (MTS). The MTS method was also compared with the SQ grade table from a previous study, and the merits and drawbacks of each are described.

Relational Discriminant Analysis Using Prototype Reduction Schemes and Mahalanobis Distances (Prototype Reduction Schemes와 Mahalanobis 거리를 이용한 Relational Discriminant Analysis)

  • Kim Sang-Woon
    • Journal of the Institute of Electronics Engineers of Korea CI / v.43 no.1 s.307 / pp.9-16 / 2006
  • Relational Discriminant Analysis (RDA) is a way of finding classifiers based on dissimilarity measures among prototypes extracted from feature vectors, rather than on the feature vectors themselves. The accuracy of an RDA classifier therefore depends on how the prototypes are selected and how proximities are measured. In this paper, we propose using Prototype Reduction Schemes (PRS) and Mahalanobis distances to increase classification accuracy. Our experimental results demonstrate that the proposed mechanism increases classification accuracy compared with conventional approaches on both real-life and artificial data sets.

Extraction of water body in before and after images of flood using Mahalanobis distance-based spectral analysis

  • Ye, Chul-Soo
    • Korean Journal of Remote Sensing / v.31 no.4 / pp.293-302 / 2015
  • Water body extraction is important for flood disaster monitoring using satellite imagery. Conventional methods have focused on finding an index that highlights water bodies and suppresses non-water areas such as vegetation or soil. The Normalized Difference Water Index (NDWI) is typically used to extract water bodies from satellite images. The drawback of NDWI, however, is that some man-made objects in built-up areas have NDWI values similar to those of water. The objective of this paper is to propose a new method that correctly extracts water bodies, even in the presence of built-up areas, from images taken before and after a flood. We first create a two-element feature vector consisting of NDWI and the Near Infrared (NIR) band, and then select a training site within a water body area. After computing the mean vector and the covariance matrix of the training site, we classify each pixel as water or non-water based on the Mahalanobis distance. We also register the before and after images using outlier removal and a triangulation-based local transformation. Finally, we create a change map by combining the before-flood and after-flood water bodies. The experimental results show that the overall accuracy and Kappa coefficient of the proposed method were 97.25% and 94.14%, respectively, while those of the NDWI method were 89.5% and 69.6%, respectively.
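
The classification step described above (mean vector and covariance matrix from a training site, then a Mahalanobis distance test per pixel) can be sketched for the two-element (NDWI, NIR) feature vector as follows. This is a minimal illustration under stated assumptions and not the author's code: the function names, the sample values, and the threshold are hypothetical. The threshold 9.21 is approximately the 0.99 quantile of a chi-square distribution with two degrees of freedom; the abstract does not say how the cutoff was chosen.

```python
def mean_and_cov_2d(samples):
    """Mean vector and 2x2 sample covariance of (ndwi, nir) pairs."""
    n = len(samples)
    mx = sum(s[0] for s in samples) / n
    my = sum(s[1] for s in samples) / n
    sxx = sum((s[0] - mx) ** 2 for s in samples) / (n - 1)
    syy = sum((s[1] - my) ** 2 for s in samples) / (n - 1)
    sxy = sum((s[0] - mx) * (s[1] - my) for s in samples) / (n - 1)
    return (mx, my), ((sxx, sxy), (sxy, syy))


def mahalanobis_sq_2d(x, mean, cov):
    """Squared Mahalanobis distance of x from mean for a symmetric 2x2 cov."""
    (a, b), (_, d) = cov
    det = a * d - b * b  # must be non-zero (non-degenerate training site)
    dx = x[0] - mean[0]
    dy = x[1] - mean[1]
    # Quadratic form with the closed-form inverse of a 2x2 matrix.
    return (d * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det


def classify_water(pixels, training, threshold_sq):
    """Label each pixel True (water) if it lies within the threshold."""
    mean, cov = mean_and_cov_2d(training)
    return [mahalanobis_sq_2d(p, mean, cov) <= threshold_sq for p in pixels]


# Hypothetical (NDWI, NIR reflectance) samples from a water training site.
training_site = [(0.60, 0.10), (0.70, 0.12), (0.65, 0.08),
                 (0.72, 0.11), (0.58, 0.09)]
# One water-like pixel and one pixel with low NDWI and high NIR.
pixels = [(0.66, 0.10), (-0.20, 0.45)]
labels = classify_water(pixels, training_site, threshold_sq=9.21)
```

Using both features lets the NIR band separate water from man-made objects whose NDWI alone resembles water, which is the drawback of NDWI that the abstract describes.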

Application of deterministic models for obtaining groundwater level distributions through outlier analysis

  • Dae-Hong Min;Saheed Mayowa Taiwo;Junghee Park;Sewon Kim;Hyung-Koo Yoon
    • Geomechanics and Engineering / v.35 no.5 / pp.499-509 / 2023
  • The objective of this study is to perform outlier analysis in order to obtain the distribution of groundwater levels through the best-fitting model. Groundwater levels were measured in 10, 25, and 30 piezometers in Seoul, Daejeon, and Suncheon in South Korea, respectively. Fifty-eight empirical distribution functions were applied to determine a suitable fit for the measured groundwater levels. The best-fitting models for the measured values are the Generalized Pareto distribution, the Johnson SB distribution, and the Normal distribution for Seoul, Daejeon, and Suncheon, respectively; the reliability is estimated with the Anderson-Darling method. To choose an appropriate confidence interval, the relationship between the amount of outlier data and the confidence level is demonstrated, and 95% is then selected as a reasonable confidence level. The best model shows a smaller error ratio than the generalized extreme value (GEV) distribution, and the results of the Mahalanobis distance and outlier labelling methods are compared and validated. The median-based outlier labelling and Mahalanobis distance methods showed higher validated error ratios than their mean-based equivalents, suggesting that the methods are sensitive to data structure.

Water body extraction using block-based image partitioning and extension of water body boundaries (블록 기반의 영상 분할과 수계 경계의 확장을 이용한 수계 검출)

  • Ye, Chul-Soo
    • Korean Journal of Remote Sensing / v.32 no.5 / pp.471-482 / 2016
  • This paper presents a water body extraction method that uses block-based image partitioning and extension of water body boundaries to improve the performance of supervised classification. To extract an initial water body area, a Mahalanobis distance image is created from the spectral information of the Normalized Difference Water Index (NDWI) and Near Infrared (NIR) band images over a training site within the water body. To reduce the effect of noise in the Mahalanobis distance image, we apply mean curvature diffusion, which controls the diffusion coefficients based on the connectivity strength between adjacent pixels, and then extract the initial water body area. After partitioning the extracted water body image into non-overlapping blocks of the same size, we update the water body area using the information of the water pixels on the water body boundaries. The update is repeated as long as the statistical distance between the water body area on the boundaries and the training site does not exceed a threshold value. The accuracy of the proposed algorithm was assessed using KOMPSAT-2 images for block sizes between $11{\times}11$ and $19{\times}19$. The overall accuracy and Kappa coefficient of the algorithm varied from 99.47% to 99.53% and from 95.07% to 95.80%, respectively.

An Exploratory Study on Donor Location Strategies in Data Fusion

  • Kim, Jonathan S.;Cho, Sung-Bin
    • Management Science and Financial Engineering / v.14 no.2 / pp.1-12 / 2008
  • This study explores several donor location strategies and discusses experimental results, contributing to savings in the time and effort required to design data fusion processes. In particular, three concepts are introduced. The Mahalanobis distance is applied to locate the nearest neighbors more effectively, as it incorporates the covariance structure of the attributes. The ideal point helps reduce the dimensionality problem that arises in conjoint-type experiments. Correspondence analysis is used to derive coordinates from non-metric attributes. Monte Carlo simulation results show that the proposed donor location strategies provide better fusion performance than the methods currently in use.
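
The effect of incorporating the covariance structure when locating the nearest donor can be seen in a small sketch. This is an illustration under assumptions and not the authors' procedure: the function names, the two-attribute setting, the covariance matrix, and the donor values are all hypothetical.

```python
def euclidean_sq(u, v):
    """Squared Euclidean distance between two 2-D points."""
    return (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2


def mahalanobis_sq(u, v, cov):
    """Squared Mahalanobis distance between two 2-D points for a 2x2 cov."""
    (a, b), (c, d) = cov
    det = a * d - b * c  # must be non-zero
    dx = u[0] - v[0]
    dy = u[1] - v[1]
    # Quadratic form with the closed-form inverse of a 2x2 matrix.
    return (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det


def nearest_donor(recipient, donors, dist):
    """Name of the donor that minimizes dist(recipient, donor)."""
    return min(donors, key=lambda name: dist(recipient, donors[name]))


# Two strongly correlated common attributes (assumed covariance).
cov = ((1.0, 0.9), (0.9, 1.0))
recipient = (0.0, 0.0)
donors = {
    "A": (1.0, 1.0),    # differs along the direction of correlation
    "B": (0.6, -0.6),   # differs against the direction of correlation
}

by_euclidean = nearest_donor(recipient, donors, euclidean_sq)
by_mahalanobis = nearest_donor(
    recipient, donors, lambda u, v: mahalanobis_sq(u, v, cov)
)
```

In this example donor B is closer in Euclidean terms, but donor A is closer in Mahalanobis terms, because a difference that follows the correlation between the attributes is less unusual than one that runs against it.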