• Title/Summary/Keyword: multivariate classification

Surface Sediments Classification in Tidal Flats using Multivariate Kriging and KOMPSAT-2 Imagery (다변량 크리깅과 KOMPSAT-2 영상을 이용한 간석지 표층 퇴적물 분류)

  • LEE, Sang-Won;PARK, No-Wook;JANG, Dong-Ho;YOO, Hee Young;LIM, Hyosuk
    • Journal of The Geomorphological Association of Korea / v.19 no.3 / pp.37-49 / 2012
  • The objective of this paper is to propose a methodology for classifying surface sediments in tidal flats that combines ground survey data with high-resolution remote sensing data through multivariate kriging. Unlike conventional methodologies, which classify remote sensing data using pre-classified sediment components, the proposed methodology first generates sediment component fraction maps and then classifies the sediments in a final stage. To generate the sediment component fractions, regression kriging, one of the multivariate kriging algorithms, is applied to integrate the ground survey data and the remote sensing data. First, trend components of sand, silt, and clay are derived through regression analysis of the ground survey data and spectral information from the remote sensing data. Then, residuals at the sample locations are computed and interpolated to generate residual components over the study area. Finally, the sediment component fractions are computed by adding the residuals to the trend components and are then classified. A case study at the Baramarae tidal flats with KOMPSAT-2 imagery was carried out to evaluate the classification capability of the proposed methodology. In the case study, the proposed methodology showed the best classification accuracy compared with the conventional classification methodologies. In particular, classification accuracy for fine-grained sediments improved considerably. The proposed methodology is therefore expected to be effective for surface sediments classification in tidal flats.
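
The three regression kriging steps described in this abstract (trend, residual interpolation, sum) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the exponential covariance model, its `sill` and `corr_range` values, the use of simple kriging for the residuals, and the synthetic data are all assumptions made here for demonstration.

```python
import numpy as np

def exp_cov(h, sill=1.0, corr_range=50.0):
    """Exponential covariance model (assumed; the abstract does not name one)."""
    return sill * np.exp(-h / corr_range)

def regression_kriging(xy_obs, X_obs, y_obs, xy_grid, X_grid,
                       sill=1.0, corr_range=50.0):
    # Step 1: trend component from ordinary least squares on the spectral bands.
    A_obs = np.column_stack([np.ones(len(X_obs)), X_obs])
    beta, *_ = np.linalg.lstsq(A_obs, y_obs, rcond=None)
    trend_grid = np.column_stack([np.ones(len(X_grid)), X_grid]) @ beta

    # Step 2: residuals at the sample locations.
    resid = y_obs - A_obs @ beta

    # Step 3: simple kriging of the residuals (zero mean assumed).
    d_oo = np.linalg.norm(xy_obs[:, None] - xy_obs[None], axis=-1)
    d_go = np.linalg.norm(xy_grid[:, None] - xy_obs[None], axis=-1)
    C = exp_cov(d_oo, sill, corr_range) + 1e-10 * np.eye(len(xy_obs))
    weights = np.linalg.solve(C, exp_cov(d_go, sill, corr_range).T)
    resid_grid = weights.T @ resid

    # Step 4: fraction estimate = trend + interpolated residual.
    return trend_grid + resid_grid

# Synthetic stand-in for ground samples and four image bands (hypothetical).
rng = np.random.default_rng(0)
xy = rng.uniform(0, 100, size=(30, 2))
bands = rng.normal(size=(30, 4))
sand = 40 + bands @ np.array([5.0, -3.0, 2.0, 1.0]) + rng.normal(scale=2.0, size=30)

# Kriging interpolates exactly, so predicting at the sample sites returns the data.
pred_at_obs = regression_kriging(xy, bands, sand, xy, bands)
```

In practice the same function would be called once per component (sand, silt, clay) with the image pixels as `xy_grid` and `X_grid`, and the resulting fraction maps would then be passed to whichever sediment classification scheme is in use.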

Clinical Relevance of the Tumor Location-Modified Lauren Classification System of Gastric Cancer

  • Choi, Jang Kyu;Park, Young Suk;Jung, Do Hyun;Son, Sang Yong;Ahn, Sang Hoon;Park, Do Joong;Kim, Hyung Ho
    • Journal of Gastric Cancer / v.15 no.3 / pp.183-190 / 2015
  • Purpose: The Lauren classification system is a widely used pathological classification system for gastric adenocarcinoma. A recent study proposed that the Lauren classification should be modified to include the anatomical location of the tumor. The resulting three types were found to differ significantly in terms of genomic expression profiles. This retrospective cohort study aimed to evaluate the clinical significance of the modified Lauren classification (MLC). Materials and Methods: A total of 677 consecutive patients who underwent curative gastrectomy from January 2005 to December 2007 for histologically confirmed gastric cancer were included. The patients were divided according to the MLC into proximal non-diffuse (PND), diffuse (D), and distal non-diffuse (DND) types. The groups were compared in terms of clinical features and overall survival. Multivariate analysis was used to assess the association between MLC and prognosis. Results: Of the 677 patients, 48, 358, and 271 had PND, D, and DND, respectively. Their 5-year overall survival rates were 77.1%, 77.7%, and 90.4%. Compared to D and PND, DND was associated with significantly better overall survival (both P<0.01). Multivariate analysis showed that age, differentiation, lympho-vascular invasion, and T and N stage, but not MLC, were independent prognostic factors for overall survival. Multivariate analysis of early gastric cancer patients showed that MLC was an independent prognostic factor for overall survival (odds ratio, 5.946; 95% confidence interval, 1.524-23.197; P=0.010). Conclusions: MLC is prognostic for survival in patients with gastric adenocarcinoma, in early gastric cancer. DND was associated with an improved prognosis compared to PND or D.

Geographical Classification of Angelica gigas using UHPLC-DAD Combined Multivariate Analyses (UHPLC-DAD 및 다변량분석법을 이용한 참당귀의 산지감별법 연구)

  • Kim, Jung-Ryul;Lee, Dong Young;Sung, Sang Hyun;Kim, Jinwoong
    • Korean Journal of Pharmacognosy / v.44 no.4 / pp.332-335 / 2013
  • In this study, A. gigas was classified by geographical origin using UHPLC-DAD combined with multivariate data analysis techniques. Six active constituents were isolated from A. gigas: nodakenin, marmesin, decursinol, demethylsuberosin, decursin, and decursinol angelate. One hundred sixty-eight A. gigas samples were analyzed using UHPLC-DAD, with the constituents determined simultaneously. Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were used to classify the samples according to geographical origin (Korea or China). Using PLS-DA Y prediction, samples from Korea and China were correctly classified at rates of 81.6% and 93.8%, respectively. This result demonstrates the potential of UHPLC-DAD combined with multivariate analysis techniques as an accurate and rapid method for classifying A. gigas according to geographical origin.
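
As a rough illustration of how a two-class PLS-DA assigns samples from a predicted Y value, the sketch below fits a PLS1 regression with the NIPALS algorithm on a 0/1 origin code and thresholds the prediction. The number of components, the 0.5 cut-off, the function names, and the synthetic six-variable data are assumptions for this example; they are not taken from the paper.

```python
import numpy as np

def pls1_fit(X, y, n_components=2):
    """PLS1 regression by NIPALS; returns centering terms and coefficients."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)          # weight vector
        t = Xk @ w                      # scores
        tt = t @ t
        p = Xk.T @ t / tt               # X loadings
        c = yk @ t / tt                 # y loading
        Xk = Xk - np.outer(t, p)        # deflate X
        yk = yk - c * t                 # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef

def plsda_predict(model, X, threshold=0.5):
    """Predict Y and assign class 1 when the prediction exceeds the threshold."""
    x_mean, y_mean, coef = model
    y_hat = y_mean + (X - x_mean) @ coef
    return (y_hat > threshold).astype(int)

# Synthetic stand-in: six measured variables, two origins coded 0 and 1.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (60, 6)), rng.normal(2.0, 1.0, (60, 6))])
y = np.repeat([0.0, 1.0], 60)

model = pls1_fit(X, y, n_components=2)
labels = plsda_predict(model, X)
accuracy = float(np.mean(labels == y))
```

The accuracy computed here is a training-set figure on artificial data; a held-out set or cross-validation would be needed for an honest estimate.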

AUTOMATED ELECTROFACIES DETERMINATION USING MULTIVARIATE STATISTICAL ANALYSIS

  • Kim Jungwhan;Lim Jong-Se
    • 한국석유지질학회:학술대회논문집 / spring / pp.10-14 / 1998
  • A systematic methodology is developed for determining electrofacies from wireline log data using multivariate statistical analysis. To account for the contribution of each log and to reduce the computational dimension, the multivariate logs are transformed into a single variable through principal components analysis. The resulting principal component logs are segmented using the statistical zonation method to enhance the efficiency and quality of the interpreted results. Hierarchical cluster analysis is then used to group the segments into electrofacies. The optimal number of groups is determined on the basis of the ratio of within-group variance to total variance and core data. This technique is applied to wells in the Korea Continental Shelf. The results of the field application demonstrate that the lithology predicted from the electrofacies classification matches the core and cuttings data with high reliability. This methodology for electrofacies classification can be used to define reservoir characteristics that are helpful for reservoir management.
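
The dimension-reduction step in this abstract, in which several logs are collapsed into a single principal component log, might look like the sketch below. Standardizing each log before the decomposition and the synthetic four-log well are assumptions of this example; the zonation and clustering steps are not shown.

```python
import numpy as np

def first_pc_log(logs):
    """Project standardized logs (rows = depth samples) onto the first PC."""
    Z = (logs - logs.mean(axis=0)) / logs.std(axis=0)
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)     # variance share of each component
    return Z @ Vt[0], explained[0]

# Synthetic well: three depth zones seen by four logs with different responses.
rng = np.random.default_rng(1)
depth_signal = np.repeat([0.0, 2.0, -1.0], 50)
loadings = np.array([1.0, -0.8, 0.6, 1.2])
logs = np.outer(depth_signal, loadings) + rng.normal(scale=0.2, size=(150, 4))

pc1, share = first_pc_log(logs)
```

The sign of a principal component is arbitrary, so `pc1` may be a mirror image of the underlying zone signal; only its shape matters for segmentation.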

Prediction of arrhythmia using multivariate time series data (다변량 시계열 자료를 이용한 부정맥 예측)

  • Lee, Minhai;Noh, Hohsuk
    • The Korean Journal of Applied Statistics / v.32 no.5 / pp.671-681 / 2019
  • Studies on predicting arrhythmia using machine learning have been actively conducted as the number of arrhythmia patients increases. Existing studies have predicted arrhythmia based on multivariate data of feature variables extracted from RR interval data at a specific time point. In this study, we consider that the pattern of changes in the heart state over time can be important information for arrhythmia prediction. Therefore, we investigate the usefulness of predicting arrhythmia with multivariate time series data obtained by extracting and accumulating the multivariate vectors of the feature variables at various time points. Using the 1-nearest neighbor classification method and its ensemble for comparison, we confirm that the method based on multivariate time series data can achieve better classification performance than the method based on multivariate data, provided that an appropriate time series distance function is selected.
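
A 1-nearest neighbor classifier over multivariate time series needs a distance between whole series. The sketch below uses dynamic time warping (DTW) with a Euclidean local cost as one possible choice; the abstract does not say which distance functions were compared, so DTW, the function names, and the toy series are assumptions here.

```python
import numpy as np

def dtw_distance(a, b):
    """DTW distance between two series of shape (time, features)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # local Euclidean cost
            cost[i, j] = d + min(cost[i - 1, j],       # step in a only
                                 cost[i, j - 1],       # step in b only
                                 cost[i - 1, j - 1])   # step in both
    return cost[n, m]

def predict_1nn(train_series, train_labels, query):
    """Return the label of the training series closest to the query."""
    dists = [dtw_distance(s, query) for s in train_series]
    return train_labels[int(np.argmin(dists))]

# Toy two-feature series with hypothetical labels.
fast = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
slow = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 1.0], [2.0, 2.0]])
flat = np.zeros((4, 2))

train_series = [fast, flat]
train_labels = ["rising", "flat"]
predicted = predict_1nn(train_series, train_labels, slow)
```

Because DTW allows one series to pause while the other advances, `slow` and `fast` are at distance zero even though their lengths differ, which a pointwise Euclidean distance could not express.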

Empirical Bayes Posterior Odds Ratio for Heteroscedastic Classification

  • Kim, Hea-Jung
    • Journal of the Korean Statistical Society / v.16 no.2 / pp.92-101 / 1987
  • Our interest is to assess in some way the relative odds or probability that a multivariate observation Z belongs to one of k multivariate normal populations with unequal covariance matrices. We derive the empirical Bayes posterior odds ratio for the classification rule when the population parameters are unknown. It is a generalization of the posterior odds ratio suggested by Geisser (1964). The classification rule does not involve the complicated distribution theory that many techniques from the sampling viewpoint require. The proposed posterior odds ratio is compared with Geisser's posterior odds ratio through a Monte Carlo study. The results show that the empirical Bayes posterior odds ratio generally performs better than Geisser's. The advantage is most pronounced when the dimension of Z is large and the training sample is small.
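
For orientation, the quantity being estimated is the posterior odds that Z came from one normal population rather than another when the covariance matrices differ. The sketch below computes a plain plug-in version, with sample means and covariances substituted for the unknown parameters. It is not the empirical Bayes ratio derived in the paper nor Geisser's ratio; the function names and data are hypothetical.

```python
import numpy as np

def gaussian_logpdf(z, mean, cov):
    """Log density of a multivariate normal at z."""
    diff = z - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(mean) * np.log(2 * np.pi) + logdet + maha)

def log_posterior_odds(z, sample_a, sample_b, prior_a=0.5):
    """Plug-in log odds that z belongs to population a rather than b."""
    la = gaussian_logpdf(z, sample_a.mean(axis=0), np.cov(sample_a, rowvar=False))
    lb = gaussian_logpdf(z, sample_b.mean(axis=0), np.cov(sample_b, rowvar=False))
    return np.log(prior_a / (1.0 - prior_a)) + la - lb

# Two training samples with different spreads (heteroscedastic case).
rng = np.random.default_rng(0)
pop_a = rng.normal([0.0, 0.0], [1.0, 1.0], size=(50, 2))
pop_b = rng.normal([4.0, 4.0], [2.0, 0.5], size=(50, 2))

z = np.array([0.2, -0.1])
odds_ab = log_posterior_odds(z, pop_a, pop_b)   # positive favors population a
```

With k populations the same pairwise comparison can be repeated, assigning Z to the population that wins against all others.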

Non-Destructive Sorting Techniques for Viable Pepper (Capsicum annuum L.) Seeds Using Fourier Transform Near-Infrared and Raman Spectroscopy

  • Seo, Young-Wook;Ahn, Chi Kook;Lee, Hoonsoo;Park, Eunsoo;Mo, Changyeun;Cho, Byoung-Kwan
    • Journal of Biosystems Engineering / v.41 no.1 / pp.51-59 / 2016
  • Purpose: This study examined the performance of two spectroscopy methods and multivariate classification methods in discriminating viable pepper seeds from their non-viable counterparts. Methods: A classification model for viable seeds was developed using partial least squares discriminant analysis (PLS-DA) with Fourier transform near-infrared (FT-NIR) and Raman spectroscopic data in the range of $9080-4150cm^{-1}$ (1400-2400 nm) and $1800-970cm^{-1}$, respectively. The datasets were divided into 70% for calibration and 30% for validation. To reduce noise in the spectra and compare the classification results, preprocessing methods such as mean, maximum, and range normalization, multivariate scattering correction, standard normal variate, and $1^{st}$ and $2^{nd}$ derivatives with the Savitzky-Golay algorithm were used. Results: The classification accuracies for calibration using FT-NIR and Raman spectroscopy were both 99% with the first derivative, whereas the validation accuracies were 90.5% with both multivariate scattering correction and standard normal variate, and 96.4% with the raw data (non-preprocessed data). Conclusions: These results indicate that FT-NIR and Raman spectroscopy are valuable tools for the classification and evaluation of viable pepper seeds, providing useful information based on PLS-DA and the threshold value.
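
Two of the preprocessing methods named in this abstract, standard normal variate (SNV) and multivariate scattering correction (MSC), together with a 70/30 split, can be sketched as below. Using the mean spectrum as the MSC reference, the random split, and the synthetic spectra are assumptions of this example rather than details from the study.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row)."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)   # s ~ intercept + slope * ref
        corrected[i] = (s - intercept) / slope
    return corrected

def split_70_30(n_samples, seed=0):
    """Random index split: 70% calibration, 30% validation."""
    idx = np.random.default_rng(seed).permutation(n_samples)
    cut = int(round(0.7 * n_samples))
    return idx[:cut], idx[cut:]

# Synthetic spectra: one base curve seen with different offsets and gains.
base = np.linspace(1.0, 2.0, 50) ** 2
offsets = np.array([0.0, 0.5, -0.3])
gains = np.array([1.0, 1.5, 0.8])
spectra = offsets[:, None] + gains[:, None] * base

snv_out = snv(spectra)
msc_out = msc(spectra)
cal_idx, val_idx = split_70_30(100)
```

Both corrections remove additive and multiplicative differences between spectra, so in this noise-free example every row collapses onto the same curve after either transform.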

Automatic Electrofacies Classification from Well Logs Using Multivariate Statistical Techniques (다변량 통계 기법을 이용한 물리검층 자료로부터의 암석물리학상 결정)

  • Lim Jong-Se;Kim Jungwhan;Kang Joo-Myung
    • Geophysics and Geophysical Exploration / v.1 no.3 / pp.170-175 / 1998
  • A systematic methodology is developed for predicting lithology using electrofacies classification from wireline log data. Multivariate statistical techniques are adopted to segment well log measurements and group the segments into electrofacies types. To account for the contribution of each log and to reduce the computational dimension, the multivariate logs are transformed into a single variable through principal components analysis. The resulting principal component logs are segmented using the statistical zonation method to enhance the quality and efficiency of the interpreted results. Hierarchical cluster analysis is then used to group the segments into electrofacies. The optimal number of groups is determined on the basis of the ratio of within-group variance to total variance and core data. This technique is applied to wells in the Korea Continental Shelf. The results of the field application demonstrate that the lithology predicted from the electrofacies classification agrees reliably with the core and cuttings data. This methodology for electrofacies determination can be used to define reservoir characteristics that are helpful for reservoir management.
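
The grouping step and the variance-ratio criterion mentioned in this abstract can be illustrated with a small agglomerative clustering routine. Ward-style merging (joining the pair that least increases the within-group sum of squares) is an assumption made here, since the abstract does not state the linkage; the segment values are synthetic.

```python
import numpy as np

def within_ss(points):
    """Sum of squared deviations of the points from their own mean."""
    return float(np.sum((points - points.mean(axis=0)) ** 2))

def ward_clustering(X, n_groups):
    """Agglomerative clustering; returns a list of index lists."""
    clusters = [[i] for i in range(len(X))]
    while len(clusters) > n_groups:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                merged = clusters[a] + clusters[b]
                increase = (within_ss(X[merged])
                            - within_ss(X[clusters[a]])
                            - within_ss(X[clusters[b]]))
                if best is None or increase < best[0]:
                    best = (increase, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def variance_ratio(X, clusters):
    """Within-group variance divided by total variance."""
    return sum(within_ss(X[c]) for c in clusters) / within_ss(X)

# Synthetic segment means from three well-separated groups.
rng = np.random.default_rng(0)
X = np.concatenate([rng.normal(c, 0.5, size=(10, 1)) for c in (0.0, 10.0, 20.0)])

ratios = {k: variance_ratio(X, ward_clustering(X, k)) for k in (1, 2, 3, 4)}
groups = ward_clustering(X, 3)
```

Plotting the ratio against the number of groups and looking for the point where it stops dropping sharply is one common way to read such a criterion; the abstract additionally uses core data to settle the choice.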

A Study on Forest Land Classification Using Multivariate Statistical Methods : A Case Study at Mt. Kwanak (다변수통계방법을 이용한 산지분류에 관한 연구)

  • 정순오
    • Journal of the Korean Institute of Landscape Architecture / v.13 no.1 / pp.43-66 / 1985
  • Korea needs proper and rational public policies on the conservation and use of forest land and other natural resources because of the accelerating expansion of national land development in recent years. Unfortunately, there is no systematic planning system to support these needs. Generally, forest land use planning needs suitability analysis based on an efficient land classification system. The goal of this study was to classify forest land using multivariate statistical methods. A case study was carried out in the winter of 1983 on a mountainous area higher than 100 m above sea level located at Mt. Kwanak in Anyang-city, Kyung-gi-do (province). The study area covered 19.80 km$^2$ and was divided into 1,383 Operational Taxonomic Units (OTUs) by a 120 m $\times$ 120 m grid. Fourteen descriptors were identified and quantified for each OTU from existing national land data: elevation, slope, aspect, terrain form, geologic material, surface soil permeability, topsoil type, depth of the solum, soil acidity, forest cover type, stand size class, stand age class, stand density class, and simple forest soil capability class. For this study, a FORTRAN IV program was written to input and output map data, and the statistical packages SPSS and BMD were used to perform the multivariate statistical analysis. The fourteen variables were analyzed to investigate the characteristics of their frequency distributions and to estimate the correlation coefficients among them. Principal component analysis was carried out to find the dimensions of forest land characteristics, and factor scores were used to select appropriate samples of OTUs throughout the study area. To develop forest land classes based on 102 surrogates, cluster and discriminant analyses of the principal descriptor variable matrix were undertaken.
Results obtained through a series of multivariate statistical analyses were as follows: 1) Principal component analysis proved to be a useful tool for data selection and for identifying the principal descriptor variables, which represented the characteristics of forest land and facilitated the selection of samples.

Bootstrap Confidence Intervals of Classification Error Rate for a Block of Missing Observations

  • Chung, Hie-Choon
    • Communications for Statistical Applications and Methods / v.16 no.4 / pp.675-686 / 2009
  • In this paper, we assume that there are two distinct multivariate normal populations with a common covariance matrix. We also assume that the two populations are equally likely and that the costs of misclassification are equal. The classification rule depends on whether the training samples include missing values. We consider bootstrap confidence intervals for the classification error rate when a block of observations is missing.
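
Under the stated assumptions (two normal populations, common covariance, equal priors and costs), the classification rule is the linear discriminant rule. The sketch below builds a percentile bootstrap interval for its error rate on complete data only; it does not reproduce the paper's treatment of a missing block, and the resampling scheme, the use of the original sample as the evaluation set, and all names are assumptions for illustration.

```python
import numpy as np

def fit_lda(x1, x2):
    """Linear discriminant direction and midpoint for two samples."""
    m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
    n1, n2 = len(x1), len(x2)
    pooled = ((n1 - 1) * np.cov(x1, rowvar=False)
              + (n2 - 1) * np.cov(x2, rowvar=False)) / (n1 + n2 - 2)
    return np.linalg.solve(pooled, m1 - m2), (m1 + m2) / 2

def error_rate(w, mid, x1, x2):
    """Share of observations assigned to the wrong population."""
    wrong1 = np.sum((x1 - mid) @ w <= 0)   # population 1 sent to 2
    wrong2 = np.sum((x2 - mid) @ w > 0)    # population 2 sent to 1
    return (wrong1 + wrong2) / (len(x1) + len(x2))

def bootstrap_ci(x1, x2, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the error rate of the rule."""
    rng = np.random.default_rng(seed)
    rates = np.empty(n_boot)
    for b in range(n_boot):
        b1 = x1[rng.integers(0, len(x1), len(x1))]   # resample population 1
        b2 = x2[rng.integers(0, len(x2), len(x2))]   # resample population 2
        w, mid = fit_lda(b1, b2)
        rates[b] = error_rate(w, mid, x1, x2)        # evaluate on original data
    return np.quantile(rates, [alpha / 2, 1 - alpha / 2])

# Synthetic training samples from two bivariate normal populations.
rng = np.random.default_rng(1)
x1 = rng.normal([0.0, 0.0], 1.0, size=(40, 2))
x2 = rng.normal([2.0, 1.0], 1.0, size=(40, 2))

lower, upper = bootstrap_ci(x1, x2)
```

Resampling within each population keeps the two training sample sizes fixed across bootstrap replicates, which matches the equal-prior setting described in the abstract.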