• Title/Summary/Keyword: data discriminant analysis

Comparison of 12 Isoflavone Profiles of Soybean (Glycine max (L.) Merrill) Seed Sprouts from Three Different Countries

  • Park, Soo-Yun;Kim, Jae Kwang;Kim, Eun-Hye;Kim, Seung-Hyun;Prabakaran, Mayakrishnan;Chung, Ill-Min
    • KOREAN JOURNAL OF CROP SCIENCE / v.63 no.4 / pp.360-377 / 2018
  • The levels of 12 isoflavones were measured in soybean (Glycine max (L.) Merrill) sprouts of 68 genetic varieties from three countries (China, Japan, and Korea). The isoflavone profile differences were analyzed using data mining methods. A principal component analysis (PCA) revealed that the CSRV021 variety was separated from the others by the first two principal components. This variety appears to be most suited for functional food production due to its high isoflavone levels. Partial least squares discriminant analysis (PLS-DA) and orthogonal projections to latent structures discriminant analysis (OPLS-DA) showed that there are meaningful isoflavone compositional differences in samples that have different countries of origin. Hierarchical clustering analysis (HCA) of these phytochemicals resulted in clusters derived from closely related biochemical pathways. These results indicate the usefulness of metabolite profiling combined with chemometrics as a tool for assessing the quality of foods and identifying metabolic links in biological systems.

Automatic Estimation of Artemia Hatching Rate Using an Object Discrimination Method

  • Kim, Sung;Cho, Hong-Yeon
    • Ocean and Polar Research / v.35 no.3 / pp.239-247 / 2013
  • Digital image processing analyzes the large volume of information contained in digital images. In this study, the Artemia hatching rate was measured by automatically classifying and counting cysts and larvae in color imaging data from cyst hatching experiments. The estimation consists of a series of steps: converting the scanned image data to binary image data, detecting objects and extracting their shape information from the converted image, choosing an optimal discriminant function, and recognizing and classifying the objects using that function. The function that classifies Artemia cysts and larvae is estimated optimally, based on classification performance, from the areas and plan-form factors of the detected objects. The hatching rate estimated from image data obtained under different experimental conditions ranged from 34% to 48%. When automatic counting (this study) was compared with manual counting, the maximum difference was about 19.7% and the average root-mean-square difference was about 10.9%. This technique can be applied to the analysis of biological specimens with similar imaging information.
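
The final classification and counting steps of such a pipeline can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the objects have already been detected and reduced to an area and a form factor, it uses a linear discriminant function whose coefficients (`W_AREA`, `W_FORM`, `BIAS`) are hypothetical placeholders rather than values from the paper, and it assumes the hatching rate is the share of larvae among all counted objects.

```python
from dataclasses import dataclass


@dataclass
class DetectedObject:
    area: float         # object area in pixels (assumed unit)
    form_factor: float  # shape descriptor; assumed here to be near 1 for round objects


# Hypothetical coefficients, chosen only to make the example work.
W_AREA = 0.01
W_FORM = -5.0
BIAS = 2.0


def discriminant(obj: DetectedObject) -> float:
    """Linear discriminant score; positive means 'larva' in this sketch."""
    return W_AREA * obj.area + W_FORM * obj.form_factor + BIAS


def classify(obj: DetectedObject) -> str:
    return "larva" if discriminant(obj) > 0 else "cyst"


def hatching_rate(objects: list[DetectedObject]) -> float:
    """Percentage of detected objects classified as larvae."""
    if not objects:
        raise ValueError("no objects detected")
    larvae = sum(1 for obj in objects if classify(obj) == "larva")
    return 100.0 * larvae / len(objects)


# Made-up measurements: small round objects (cysts) and larger elongated ones (larvae).
objects = [
    DetectedObject(area=120.0, form_factor=0.95),
    DetectedObject(area=130.0, form_factor=0.92),
    DetectedObject(area=400.0, form_factor=0.40),
    DetectedObject(area=110.0, form_factor=0.97),
    DetectedObject(area=350.0, form_factor=0.35),
]
print(hatching_rate(objects))
```

In practice the coefficients would be fitted to manually labeled objects, which is the role the abstract assigns to the step that chooses the optimal discriminant function.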

Optimized Polynomial Neural Network Classifier Designed with the Aid of Space Search Simultaneous Tuning Strategy and Data Preprocessing Techniques

  • Huang, Wei;Oh, Sung-Kwun
    • Journal of Electrical Engineering and Technology / v.12 no.2 / pp.911-917 / 2017
  • Three issues generally arise when developing neural network classifiers: 1) the discriminant function; 2) the large number of parameters in the classifier design; and 3) high-dimensional training data. With these in mind, we propose a space search optimized polynomial neural network classifier (PNNC) that uses data preprocessing techniques and a simultaneous tuning strategy, a balanced optimization strategy applied in the design of the PNNC when running space search optimization. Unlike the conventional probabilistic neural network classifier, the proposed classifier adopts two types of polynomials for developing discriminant functions. The overall optimization of the PNNC is realized through structure optimization and parameter optimization under the simultaneous tuning strategy. The space search optimization algorithm serves as the optimization vehicle for both structure and parameter optimization in the construction of the PNNC. Furthermore, principal component analysis and linear discriminant analysis are selected as the data preprocessing techniques for the PNNC. Experimental results show that the proposed classifier achieves better classification accuracy than several other well-known classifiers.

Discriminant Analysis of Marketed Liquor by a Multi-channel Taste Evaluation System

  • Kim, Nam-Soo
    • Food Science and Biotechnology / v.14 no.4 / pp.554-557 / 2005
  • An 8-channel taste evaluation system was prepared as a taste-sensing device and applied to the discriminant analysis of marketed liquor. The biomimetic polymer membranes for the system were prepared by casting polyvinyl chloride, bis(2-ethylhexyl) sebacate as plasticizer, and electroactive materials such as valinomycin in the ratio of 33:66:1, and were separately attached over the sensitive area of ion-selective electrodes to construct the corresponding taste sensor array. The sensor array, in conjunction with a double junction reference electrode, was connected to a high-input impedance amplifier, and the amplified sensor signals were interfaced to a personal computer via an A/D converter. When the normalized signal data from the sensor array for three groups of marketed liquor (Maesilju, Soju, and beer) were analyzed by principal component analysis, the 1st, 2nd, and 3rd principal components accounted for most of the total data variance, and the liquor samples were discriminated well in the two-dimensional principal component planes formed by the 1st-2nd and the 1st-3rd principal components.

Discriminant Analysis under a Patterned Missing Values

  • Kim, Hea-Jung
    • Journal of the Korean Statistical Society / v.18 no.1 / pp.13-25 / 1989
  • This paper suggests a classification rule with unequal covariance matrices for discriminant analysis involving patterned incomplete data. This is an extension of Geisser's (1966) result to the case of missing observations. For the classification rule, we introduce an algorithm that contains a data augmentation step and a Monte Carlo integration step, and show that the algorithm yields a consistent estimator of the true classification probability. The proposed method is compared to the complete observation vector method through a Monte Carlo study. The results show that the suggested method generally performs better than the complete observation vector method, which excludes from the analysis any observation vector with one or more missing values. The results also verify the consistency of the algorithm.

Hybrid Pattern Recognition Using a Combination of Different Features

  • Choi, Sang-Il
    • Journal of the Korea Society of Computer and Information / v.20 no.11 / pp.9-16 / 2015
  • We propose a hybrid pattern recognition method that effectively combines two different features to improve data classification. We first extract the PCA (Principal Component Analysis) and LDA (Linear Discriminant Analysis) features, both of which are widely used in pattern recognition, to construct a set of basic features, and then evaluate the separability of each basic feature. Based on the evaluation results, we select only the basic features that contain a large amount of discriminative information to construct the combined features. Experimental results on various data sets from the UCI machine learning repository show that the proposed combined features give better recognition rates than the PCA or LDA features used alone.
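
The selection step can be sketched as below. The abstract does not say how separability is measured, so this sketch assumes a two-class Fisher ratio (squared distance between class means divided by the sum of class variances). The feature names, values, and threshold are made up for illustration, and the features are assumed to have been extracted beforehand.

```python
from statistics import mean, pvariance


def fisher_ratio(values_a, values_b):
    """Assumed separability score: between-class over within-class scatter."""
    between = (mean(values_a) - mean(values_b)) ** 2
    within = pvariance(values_a) + pvariance(values_b)
    return between / within if within > 0 else float("inf")


def select_features(features, labels, threshold):
    """Keep the names of features whose separability reaches the threshold."""
    selected = []
    for name, column in features.items():
        class_0 = [v for v, y in zip(column, labels) if y == 0]
        class_1 = [v for v, y in zip(column, labels) if y == 1]
        if fisher_ratio(class_0, class_1) >= threshold:
            selected.append(name)
    return selected


# Toy pool of already-extracted basic features (hypothetical values).
labels = [0, 0, 0, 1, 1, 1]
features = {
    "pca_1": [1.0, 1.2, 0.8, 1.1, 0.9, 1.0],     # classes overlap
    "pca_2": [0.0, 0.2, 0.1, 2.0, 2.1, 1.9],     # classes well separated
    "lda_1": [-1.0, -1.1, -0.9, 1.0, 0.9, 1.1],  # classes well separated
}
print(select_features(features, labels, threshold=1.0))
```

The selected columns would then be concatenated into the combined feature vector passed to the classifier.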

Optimal number of dimensions in linear discriminant analysis for sparse data (희박한 데이터에 대한 선형판별분석에서 최적의 차원 수 결정)

  • Shin, Ga In;Kim, Jaejik
    • The Korean Journal of Applied Statistics / v.30 no.6 / pp.867-876 / 2017
  • Datasets with small n and large p are found in many fields, and their analysis remains a challenge in statistics. Discriminant analysis models for such datasets have recently been developed for classification problems. One class of these models tries to detect dimensions that distinguish well between groups, and the number of detected dimensions is typically smaller than p. In such models, the number of dimensions is important for the prediction and visualization of data, and it is usually determined by K-fold cross-validation (CV). However, in sparse data scenarios, CV is not reliable for determining the optimal number of dimensions, since each fold may contain only a few observations. We therefore propose a method that determines the number of dimensions using a measure based on the standardized distance between the group means in the reduced dimensions. The proposed method is verified through simulations.

Development and Application of Water Quality Level Model (WQLM) for the Small Streams of Rural Watersheds with Discriminant Analysis (판별분석을 통한 농촌유역 소하천의 수질등급모형(WQLM) 개발 및 적용)

  • Kim, Jin-Ho;Choi, Chul-Mann;Ryu, Jong-Soo;Jung, Goo-Bok;Shin, Joung-Du;Han, Kuk-Heon;Lee, Jung-Taek;Kwun, Soon-Kuk
    • Journal of Korean Society on Water Environment / v.23 no.2 / pp.260-265 / 2007
  • This study was carried out to complement existing water quality standards and to establish a new concept of water quality standards that reflects the current state of water quality in small streams. To this end, discriminant analysis was performed and a Water Quality Level Model (WQLM) was developed using data on EC, BOD, $COD_{Mn}$, SS, T-N, T-P, and $NH_3-N$ from 224 agricultural streams. To assign a level to each water quality parameter, the values were divided into five groups of 20% each, ordered from best to worst water quality. The water quality level of a stream was then assigned on the basis of its lowest parameter level. As a result, no stream corresponded to Level I, 2 streams to Level II, 22 streams to Level III, 70 streams to Level IV, and 130 streams to Level V. The average values of the water quality parameters were highest in Level V. Of the 7 parameters, EC, SS, and T-N were selected as the variables related to water quality level. By standardized canonical discriminant function coefficient, EC had the highest discriminant power of the three variables (0.625), followed by T-N (0.509) and SS (0.414). The discriminant functions for the water quality levels were: Level II, $-2.973+19.376{\times}(EC)+0.647{\times}(T-N)+0.009{\times}(SS)$; Level III, $-3.288+19.190{\times}(EC)+0.733{\times}(T-N)+0.041{\times}(SS)$; Level IV, $-4.462+27.097{\times}(EC)+0.792{\times}(T-N)+0.053{\times}(SS)$; and Level V, $-9.117+40.040{\times}(EC)+1.305{\times}(T-N)+0.111{\times}(SS)$. In a test on the real agricultural watersheds of Jeongan and Euidang in Gongju city, the fit of the WQLM was high, at 88.78%. However, more accurate water quality assessment of agricultural streams will require collecting much more data and continually revising and complementing the WQLM.
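
The four discriminant functions can be evaluated as in the sketch below. The coefficients are copied from the abstract. Two things are assumptions: that a stream is assigned to the level whose function gives the highest score (the usual rule for classification functions, not stated in the abstract), and the input values, which are made up. The abstract does not give the measurement units, so the inputs must use the same units as the original study.

```python
# (constant, EC coefficient, T-N coefficient, SS coefficient), from the abstract.
WQLM_FUNCTIONS = {
    "II": (-2.973, 19.376, 0.647, 0.009),
    "III": (-3.288, 19.190, 0.733, 0.041),
    "IV": (-4.462, 27.097, 0.792, 0.053),
    "V": (-9.117, 40.040, 1.305, 0.111),
}


def wqlm_scores(ec: float, tn: float, ss: float) -> dict[str, float]:
    """Evaluate each level's discriminant function for one stream."""
    return {
        level: c0 + c_ec * ec + c_tn * tn + c_ss * ss
        for level, (c0, c_ec, c_tn, c_ss) in WQLM_FUNCTIONS.items()
    }


def wqlm_level(ec: float, tn: float, ss: float) -> str:
    """Assumed decision rule: pick the level with the highest score."""
    scores = wqlm_scores(ec, tn, ss)
    return max(scores, key=scores.get)


# Hypothetical streams: one with low readings, one with high readings.
print(wqlm_level(ec=0.05, tn=1.0, ss=2.0))
print(wqlm_level(ec=0.5, tn=5.0, ss=30.0))
```

No function is given for Level I because no stream in the study fell into that level.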

Classification of Side Somatotype of the Trunk by Analysing Photographic Data (사진자료에 의한 여성 상반신 측면체형 분류)

  • Jung, Myong-Seok
    • Korean Journal of Human Ecology / v.12 no.5 / pp.767-776 / 2003
  • The purpose of this study was to classify side somatotypes of the trunk by analysing photographic data and then to study their distribution across age groups. The subjects were 315 women aged 18 to 49. Thirty-one photographic measurements were taken for each subject. Principal component analysis yielded the factors affecting the side somatotype of the trunk: vertical size, posterior/anterior depth, and neck posture. The side somatotypes of the trunk were classified into 4 types, and their differences were shown by analysing the photographic data. The side silhouettes of the 4 types were compared with the balanced type. Using the canonical discriminant function with unstandardized canonical coefficients, an individual's trunk somatotype could be discriminated from the photographic data of anterior neck height, anterior waist height, posterior waist depth, buttock height, and anterior depth at the level of back protrusion. The frequency distribution of the side somatotypes of the trunk by age group could be applied to clothing construction and to setting clothing production rates.

Influence of Website Attributes on the Visit to Plastic Surgery Websites (성형외과 의원의 웹 방문자 수에 영향을 미치는 웹 사이트 속성)

  • Cho, Yeong-Bin;An, Seong-Hyeon
    • Journal of Information Technology Applications and Management / v.14 no.3 / pp.137-149 / 2007
  • Most hospitals, especially small ones, have recently tried to attract customers through the Internet, as companies have done. Plastic surgery hospitals have made various attempts to increase visits to their websites. However, few studies in plastic surgery have examined which attributes contribute to increasing the number of website visits. To identify the attributes that matter for the number of visits, we compared the functional attributes of 30 high-visit plastic surgery websites with those of 30 low-visit websites using statistical and data mining methods. Three methods were applied: Multiple Discriminant Analysis (statistical method), Decision Trees (data mining method), and Artificial Neural Network (data mining method). The results of the methods were also evaluated against one another. The results show that a few attributes, such as 'Simulating cyber plastic surgery program' and 'recommendation of information', explain the difference in the number of visitors between high-visit and low-visit websites. The methodology employed in this study provides an efficient way of improving the satisfaction of visitors to plastic surgery websites.