• Title/Summary/Keyword: multivariate classification

Pattern Recognition for Typification of Whiskies and Brandies in the Volatile Components using Gas Chromatographic Data

  • Myoung, Sungmin;Oh, Chang-Hwan
    • Journal of the Korea Society of Computer and Information
    • /
    • v.21 no.5
    • /
    • pp.167-175
    • /
    • 2016
  • Volatile components of 82 commercial liquors (44 single malt whiskies, 20 blended whiskies, and 18 brandies) were analyzed by gas chromatography after liquid-liquid extraction with dichloromethane. Pattern recognition techniques, namely principal component analysis (PCA), cluster analysis (CA), linear discriminant analysis (LDA), and partial least squares discriminant analysis (PLSDA), were applied to discriminate between the liquor categories. Classification rules were validated by considering the sensitivity and specificity of each class. Both LDA and PLSDA gave 100% sensitivity and specificity for all categories. These results suggest that multivariate data analysis can identify the common characteristics that typify whiskies and brandies.
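
The validation step described in this abstract, per-class sensitivity and specificity, can be sketched in a few lines of Python. This is only an illustration: the function name `sensitivity_specificity` and the label vectors are hypothetical, and only the class sizes (44, 20, 18) come from the abstract.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return {class: (sensitivity, specificity)} from one-vs-rest counts."""
    results = {}
    for c in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        tn = sum(1 for t, p in pairs if t != c and p != c)  # true negatives
        sens = tp / (tp + fn) if tp + fn else float("nan")
        spec = tn / (tn + fp) if tn + fp else float("nan")
        results[c] = (sens, spec)
    return results

# Hypothetical labels using the class sizes from the abstract.
y_true = ["single malt"] * 44 + ["blended"] * 20 + ["brandy"] * 18

# A classifier that assigns every sample correctly (the reported outcome).
perfect = sensitivity_specificity(y_true, list(y_true))

# A hypothetical classifier that mistakes one blended whisky for a single malt.
y_pred_one_error = list(y_true)
y_pred_one_error[44] = "single malt"
one_error = sensitivity_specificity(y_true, y_pred_one_error)
```

With a single misassigned blended whisky, the sensitivity of the blended class falls to 19/20 and the specificity of the single malt class to 37/38, while the brandy class is unaffected.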

Electron-Morphometric Classification of the Native Honeybees from Korea. Part III. Discriminant Analysis for Different Localities Based on the Total Characters (한국산 재래꿀벌의 전자계량형태학적 분류. III. 전 47형질에 대한 각 지역간 판정분석)

  • 권용정;허은엽
    • Korean journal of applied entomology
    • /
    • v.32 no.1
    • /
    • pp.42-50
    • /
    • 1993
  • Multivariate discriminant analyses were performed on populations of native honeybee workers (Apis cerana) sampled from 11 localities in spring and 12 in summer in Korea. In comparisons among localities, the correct assignment rate averaged 91.67% in spring and 88.44% in summer. In the comparison between the two seasons, it averaged 97.58%. The comparison made regardless of season gave the lowest correct assignment rate, 70.16%.

REGIONAL CLASSIFICATION OF SHIZUOKA PREFECTURE WITH GIS BASED ON THE DATA OF WEATHER DISASTERS

  • HOTTA Asumi;IWASAKI Kazutaka
    • Proceedings of the KSRS Conference
    • /
    • 2005.10a
    • /
    • pp.65-68
    • /
    • 2005
  • Effective disaster prevention requires some idea of when, where, why, and what kind of weather disasters may occur, and how large they may be. However, the regional characteristics of Shizuoka Prefecture from the viewpoint of weather disasters have not been studied before. In this study, the authors gathered data on how many times weather disasters occurred in Shizuoka Prefecture over the last fourteen years and divided the prefecture into regions using multivariate analysis. The authors applied principal component analysis to these data and then applied cluster analysis to the scores of the principal components found to be significant. Finally, the authors set the regional division based on these clusters and described the regional characteristics. This study could contribute to weather disaster prevention in Shizuoka Prefecture.

Order-Restricted Inference with Linear Rank Statistics in Microarray Data

  • Kang, Moon-Su
    • The Korean Journal of Applied Statistics
    • /
    • v.24 no.1
    • /
    • pp.137-143
    • /
    • 2011
  • Classifying subjects with an unknown distribution from a small sample often involves order-restricted constraints in multivariate parameter setups. These constraints make optimal inference based on the conventional likelihood ratio infeasible. Fortunately, Roy (1953) introduced the union-intersection principle (UIP), which provides an alternative avenue. Multivariate linear rank statistics, combined with that principle, yield an appropriately robust testing procedure. Furthermore, a conditionally distribution-free test based on exact permutation theory is used to generate p-values, even for small samples. The method is illustrated with a real microarray data example (Lobenhofer et al., 2002).
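
The idea of obtaining exact p-values from permutation theory for a rank statistic can be illustrated with a deliberately simplified case. The sketch below is not the order-restricted, union-intersection procedure of the paper: it assumes a plain two-sample rank-sum statistic and a one-sided alternative, and the function names and data are hypothetical.

```python
from itertools import combinations

def midranks(values):
    """1-based ranks of the values, with ties replaced by their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks

def exact_rank_sum_pvalue(x, y):
    """One-sided exact p-value: P(rank sum of the x group >= observed)."""
    ranks = midranks(list(x) + list(y))
    n = len(x)
    observed = sum(ranks[:n])
    hits = total = 0
    # Enumerate every way of choosing which pooled observations form group x.
    for idx in combinations(range(len(ranks)), n):
        total += 1
        if sum(ranks[i] for i in idx) >= observed - 1e-12:
            hits += 1
    return hits / total

# Hypothetical small samples: every x value exceeds every y value.
p_value = exact_rank_sum_pvalue([5.1, 6.3, 7.2], [1.0, 2.4, 3.3])
```

Here only one of the 20 possible group assignments gives a rank sum as large as the observed one, so the exact p-value is 1/20. Full enumeration is practical only for small samples, which is the setting the abstract describes.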

A Study on High Breakdown Discriminant Analysis : A Monte Carlo Simulation

  • Moon Sup;Young Joo;Youngjo
    • Communications for Statistical Applications and Methods
    • /
    • v.7 no.1
    • /
    • pp.225-232
    • /
    • 2000
  • The linear and quadratic discriminant functions based on normal theory are widely used to classify an observation into one of several predefined groups, but these functions are sensitive to outliers. A high breakdown procedure for estimating the location and scatter of multivariate data is the minimum volume ellipsoid (MVE) estimator. To obtain high breakdown classifiers, outliers in multivariate data are detected using the robust Mahalanobis distance based on MVE estimators, and the weighted estimators are inserted into the classification functions. A small-sample Monte Carlo study shows that the high breakdown robust procedures perform better than the classical classifiers.
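
The outlier-detection step can be sketched as follows, under strong simplifying assumptions: the data are two-dimensional, the robust location and scatter are taken as given rather than computed by an MVE algorithm, and the cutoff is the 0.975 quantile of a chi-square distribution with 2 degrees of freedom (a common convention, not stated in the abstract). All names and numbers are hypothetical.

```python
import math

def mahalanobis_sq_2d(point, center, scatter):
    """Squared Mahalanobis distance for 2-D data, given a 2x2 scatter matrix."""
    (a, b), (c, d) = scatter
    det = a * d - b * c
    dx, dy = point[0] - center[0], point[1] - center[1]
    # The inverse of [[a, b], [c, d]] is (1/det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det

# For 2 degrees of freedom the chi-square quantile has the closed form -2*ln(1 - p).
CUTOFF = -2.0 * math.log(1.0 - 0.975)

def outlier_weights(points, center, scatter, cutoff=CUTOFF):
    """Weight 1 for points inside the tolerance ellipse, 0 for flagged outliers."""
    return [1 if mahalanobis_sq_2d(p, center, scatter) <= cutoff else 0
            for p in points]

def weighted_mean(points, weights):
    """Location estimate that ignores zero-weight points."""
    total = sum(weights)
    return tuple(sum(w * p[k] for p, w in zip(points, weights)) / total
                 for k in range(2))

# Hypothetical robust estimates and data; the last point is a gross outlier.
center, scatter = (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0))
points = [(0.5, 0.5), (-1.0, 0.2), (10.0, 10.0)]
weights = outlier_weights(points, center, scatter)
clean_center = weighted_mean(points, weights)
```

The reweighted location (and, analogously, a reweighted scatter) would then replace the classical estimates in the linear or quadratic discriminant function.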

Multivariate Analysis for Classification of Smog Type during the Summer Season in Seoul, Korea (다변량해석을 이용한 서울시 하계 스모그의 형태 분류)

  • 홍낙기;이종범;김용국
    • Journal of Korean Society for Atmospheric Environment
    • /
    • v.9 no.4
    • /
    • pp.278-287
    • /
    • 1993
  • To classify smog types during the summer season in Seoul, air quality and meteorological data were analyzed by multivariate analysis. Among 15 variables relating to visibility, 10 were selected by multiple regression analysis for clustering of smog types: total suspended particles, sulfur dioxide, ozone, nitrogen dioxide, total hydrocarbons, south-north wind component, relative humidity, precipitable water, mixing height, and air temperature. Smog types were grouped into three clusters using the cubic clustering criterion, and the clusters contained 74, 28, and 16 days, respectively. The clusters were clearly separated by sulfur dioxide, precipitable water, and air temperature. The first cluster was characterized by high ozone concentration and meteorological conditions favorable to ozone formation. Visibility in the first cluster was therefore considered to be affected by photochemical smog. The third cluster showed characteristics of sulphurous smog, with higher concentrations of primary pollutants under drier conditions than in the other clusters. The characteristics of the second cluster were less clear, but it was considered intermediate between photochemical smog and sulphurous smog.

A Study of Predictive method of Daechung Dam Inflow Using Multivariate Neural Network Model (다변량 신경망 모형을 이용한 대청댐 유입량 산정에 관한 연구)

  • Kang, Kwon-Su;Yum, Kyung-Taek;Heo, Jun-Haeng
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2012.05a
    • /
    • pp.359-362
    • /
    • 2012
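  • Better estimates of the future behavior of key hydrologic variables are needed in the design, planning, and operation of water resources systems. For example, when a multipurpose dam is operated to maintain multiple functions such as hydropower generation, recreational use, and dilution of downstream pollution, forecasts of planned flows for upcoming time periods are required. The purpose of forecasting is to provide accurate estimates of what will occur in the future (Keith W. Hipel, 1994). The main objective of this study is to forecast inflow at Daechung Dam, in the Geum River basin, using multivariate neural network models. Neural network models such as MLP, PCA, and RBF were applied to hydrologic data for Daechung Dam, including rainfall, inflow, temperature, and humidity, in search of the optimal model. Among them, the New classification model and the New Function Approximation Network model gave better results than the other models.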

Estimating the AUC of the MROC curve in the presence of measurement errors

  • G, Siva;R, Vishnu Vardhan;Kamath, Asha
    • Communications for Statistical Applications and Methods
    • /
    • v.29 no.5
    • /
    • pp.533-545
    • /
    • 2022
  • Collecting data on several variables, especially in the field of medicine, gives rise to the problem of measurement errors. The presence of such errors may influence the outcomes or the parameter estimates of a model. In a classification scenario, measurement errors affect the intrinsic and summary measures of the Receiver Operating Characteristic (ROC) curve. In the context of the ROC curve, only a few researchers have studied the effect of measurement errors on estimating the area under the curve, and only in the univariate setup. In this paper, we work on estimating the area under the multivariate ROC curve in the presence of measurement errors. The proposed work is supported with a real dataset and simulation studies. Results show that the proposed bias-corrected estimator corrects the AUC with minimum bias and minimum mean square error.
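
As background for the quantity being estimated, the sketch below computes the ordinary empirical AUC of a single score as the proportion of (positive, negative) pairs ranked correctly, counting ties as one half. It does not implement the multivariate ROC model or the bias-corrected estimator of the paper; the function name and scores are hypothetical.

```python
def empirical_auc(scores_pos, scores_neg):
    """Empirical AUC: P(positive score > negative score) + 0.5 * P(tie)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical classifier scores for diseased (positive) and healthy (negative) subjects.
auc = empirical_auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

Eight of the nine pairs are ordered correctly here, so the estimate is 8/9. Measurement error in the scores blurs the separation between the two groups, which is why an uncorrected estimate of this kind can be biased.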

Case Studies Regarding the Classification of Public Caves (공개동굴의 유형분류에 관한 사례연구)

  • Hong, Hyun-Chul
    • Journal of the Speleological Society of Korea
    • /
    • no.93
    • /
    • pp.13-25
    • /
    • 2009
  • This study presents case studies that provide information on cave tourism resources. It considers a variety of selected variables describing the interior and exterior of caves, extending the factors used in the academic classification of caves. Cluster analysis, one of the multivariate analysis techniques, was applied and the results were reviewed. The results show that public caves can be classified by multiple criteria according to the human environment of the surrounding area. A regional classification of public caves is derived from this study.

Hybrid Learning Architectures for Advanced Data Mining: An Application to Binary Classification for Fraud Management (개선된 데이터마이닝을 위한 혼합 학습구조의 제시)

  • Kim, Steven H.;Shin, Sung-Woo
    • Journal of Information Technology Application
    • /
    • v.1
    • /
    • pp.173-211
    • /
    • 1999
  • The task of classification permeates all walks of life, from business and economics to science and public policy. In this context, nonlinear techniques from artificial intelligence have often proven to be more effective than the methods of classical statistics. The objective of knowledge discovery and data mining is to support decision making through the effective use of information. The automated approach to knowledge discovery is especially useful when dealing with large data sets or complex relationships. For many applications, automated software may find subtle patterns which escape the notice of manual analysis, or whose complexity exceeds the cognitive capabilities of humans. This paper explores the utility of a collaborative learning approach involving integrated models in the preprocessing and postprocessing stages. For instance, a genetic algorithm effects feature-weight optimization in a preprocessing module. Moreover, an inductive tree, artificial neural network (ANN), and k-nearest neighbor (kNN) techniques serve as postprocessing modules. More specifically, the postprocessors act as second-order classifiers which determine the best first-order classifier on a case-by-case basis. In addition to the second-order models, a voting scheme is investigated as a simple, but efficient, postprocessing model. The first-order models consist of statistical and machine learning models such as logistic regression (logit), multivariate discriminant analysis (MDA), ANN, and kNN. The genetic algorithm, inductive decision tree, and voting scheme act as kernel modules for collaborative learning. These ideas are explored against the background of a practical application relating to financial fraud management which exemplifies a binary classification problem.
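
The voting scheme mentioned in this abstract can be sketched as a simple majority vote over the first-order classifiers. The abstract does not specify the voting rule or tie handling, so both are assumptions here: each classifier gets one vote, and ties go to the earliest-listed classifier among the tied labels. The predictions are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier label lists into one label per case.

    `predictions` holds one list of labels per classifier, all of equal length.
    Ties are broken in favor of the earliest-listed classifier (an assumption).
    """
    voted = []
    for labels in zip(*predictions):  # labels = one case, one vote per classifier
        counts = Counter(labels)
        top = max(counts.values())
        for label in labels:
            if counts[label] == top:
                voted.append(label)
                break
    return voted

# Hypothetical fraud (1) / non-fraud (0) predictions for four cases.
logit = [1, 0, 1, 0]
mda = [1, 1, 0, 0]
ann = [0, 1, 1, 0]
knn = [1, 0, 0, 0]
combined = majority_vote([logit, mda, ann, knn])
```

With four voters, cases 2 and 3 are tied 2 to 2 and fall back to the logit prediction. A second-order classifier, as described in the abstract, would instead learn which first-order model to trust for each case.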
