• Title/Summary/Keyword: k-NN classification

Biometrics Based on Multi-View Features of Teeth Using Principal Component Analysis (주성분분석을 이용한 치아의 다면 특징 기반 생체식별)

  • Chang, Chan-Wuk;Kim, Myung-Su;Shin, Young-Suk
    • Korean Journal of Cognitive Science / v.18 no.4 / pp.445-455 / 2007
  • We present a new biometric identification system based on multi-view features of teeth using principal component analysis (PCA). The multi-view features of teeth consist of the frontal view, the left side view, and the right side view. In this paper, we try to lay the foundations of dental biometrics for secure access in real-life environments. We photographed the three views of the teeth in a specially designed experimental environment and derived 42 principal components as the features for individual identification. Classification for individual identification is based on the nearest neighbor (NN) algorithm, using the distance between the multi-view teeth images and rotated versions of them. After rotating the test data by two degrees, the average identification performance is 95.2% on the left side view and 91.3% on the right side view.
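The nearest neighbor identification step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the PCA projection has already been computed, and the `gallery` dictionary, its labels, and the 3-dimensional toy vectors are hypothetical stand-ins for the 42 principal components mentioned above.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(probe, gallery):
    """Return the identity whose enrolled feature vector is closest to the probe.

    gallery: dict mapping an identity label to its feature vector
    (hypothetical stand-ins for PCA coefficients of the teeth images).
    """
    return min(gallery, key=lambda label: euclidean(probe, gallery[label]))

# Toy 3-dimensional features; the abstract reports 42 principal components.
gallery = {
    "person_a": [0.9, 0.1, 0.3],
    "person_b": [0.2, 0.8, 0.5],
    "person_c": [0.4, 0.4, 0.9],
}
probe = [0.25, 0.75, 0.45]  # a slightly perturbed view of person_b
print(identify(probe, gallery))
```

A rotated test image would enter this sketch as a perturbed `probe` vector; identification succeeds as long as the perturbation leaves the correct identity as the closest enrolled vector.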

Emotion Recognition Using Color and Pattern in Textile Images (컬러와 패턴을 이용한 텍스타일 영상에서의 감정인식 시스템)

  • Shin, Yun-Hee;Kim, Young-Rae;Kim, Eun-Yi
    • Journal of the Institute of Electronics Engineers of Korea CI / v.45 no.6 / pp.154-161 / 2008
  • In this paper, a novel method is proposed that uses color and pattern information to recognize the emotions conveyed by a textile. Here we use the 10 Kobayashi emotions to represent emotions: {romantic, clear, natural, casual, elegant, chic, dynamic, classic, dandy, modern}. The proposed system is composed of feature extraction and classification. To transform the subjective emotions into physical visual features, we extract representative colors and patterns from textiles. The representative color prototypes are extracted by a color quantization method, and patterns are extracted by a wavelet transform followed by statistical analysis. These extracted features are given as input to neural network (NN)-based classifiers, each of which decides whether or not a textile has the corresponding emotion. The effectiveness of the proposed system was assessed with 389 textiles collected from various application domains such as interior, fashion, and artificial ones. The results showed that the proposed method has a precision of 100% and a recall of 99%, so it can be used in various textile industries.
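For readers unfamiliar with the two reported metrics, the sketch below shows how precision and recall are computed for one binary "has this emotion / does not" classifier. The label lists are made-up toy data, not results from the paper.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = textile has the emotion)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)        # true positives
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)    # false positives
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)    # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical ground truth and predictions for six textiles.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1]
print(precision_recall(y_true, y_pred))
```

In this toy case every predicted positive is correct (precision 1.0) while one true positive is missed (recall 0.75), the same pattern of "high precision, slightly lower recall" that the abstract reports.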

Analysis of Heart Rate Variability in Constitution Types During Active and Passive Coping Caused by Electroacupuncture (통증으로 유발한 능동 및 수동 대처상황에서 체질에 따른 Heart Rate Variability 분석)

  • Kim Jin-Keun;Jang Kyeong-Seon;Lee Sang-Kwan
    • Journal of Physiology & Pathology in Korean Medicine / v.20 no.1 / pp.115-124 / 2006
  • The purpose of this study is to investigate the relationships between the biological basis of coping strategy and the different constitutions. First, subjects were divided into 3 groups by the Questionnaire for the Sasang Constitution Classification II and Yin-Yang Property Analysis. Then each group was assigned to two experimental coping conditions, active and passive, in turn. The SDNN (standard deviation of the NN intervals) index of HRV (heart rate variability) was estimated in the two conditions after giving an aversive pain stimulus. The results of the study were as follows: 1. The interaction between constitution and coping condition is significant (p<0.05). 2. The SDNNs of Shaoyangren are higher than those of Taiyinren under the passive condition, but the opposite holds under the active condition (p<0.05). 3. The main effect of constitution is also significant, but that of coping condition is not. 4. The Shaoyangren group is higher than the Shaoyinren group in multiple comparisons (p<0.05). 5. The interaction between Yin-Yang constitution and coping condition is significant, and only the main effect of constitution is significant (p<0.05). According to these results, different constitutions can respond differently to coping conditions, and this is closely related to the biological mechanisms associated with the two basic coping strategies.
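The SDNN index used in this abstract is the standard deviation of the normal-to-normal (NN) beat intervals. A minimal sketch follows; the interval values are invented, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the abstract does not say which variant the authors used.

```python
import statistics

def sdnn(nn_intervals_ms):
    """SDNN: standard deviation of NN intervals, in milliseconds.

    Uses the sample standard deviation (n - 1 denominator); some tools use
    the population form instead, which is an analysis choice.
    """
    return statistics.stdev(nn_intervals_ms)

# Hypothetical NN intervals (ms) from a short recording.
intervals = [800, 810, 790, 820, 780]
print(round(sdnn(intervals), 2))
```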

Basic Research on the Possibility of Developing a Landscape Perceptual Response Prediction Model Using Artificial Intelligence - Focusing on Machine Learning Techniques - (인공지능을 활용한 경관 지각반응 예측모델 개발 가능성 기초연구 - 머신러닝 기법을 중심으로 -)

  • Kim, Jin-Pyo;Suh, Joo-Hwan
    • Journal of the Korean Institute of Landscape Architecture / v.51 no.3 / pp.70-82 / 2023
  • The recent surge of IT and data acquisition is shifting the paradigm in all aspects of life, and these advances are also affecting academic fields. Research topics and methods are being improved through academic exchange and connections. In particular, data-based research methods are employed in various academic fields, including landscape architecture, where continuous research is needed. Therefore, this study aims to investigate the possibility of developing a landscape preference evaluation and prediction model using machine learning, a branch of artificial intelligence, reflecting the current situation. To achieve this goal, machine learning techniques were applied to the landscaping field to build a landscape preference evaluation and prediction model and to verify the simulation accuracy of the model. For this, landscape images of wind power facilities, which have recently attracted attention as a renewable energy source, were selected as the research objects. For analysis, images of the wind power facility landscapes were collected using web crawling techniques, and an analysis dataset was built. Orange version 3.33, a program from the University of Ljubljana, was used for machine learning analysis to derive a prediction model with excellent performance. A model that integrates the evaluation criteria of machine learning and a separate model structure for the evaluation criteria were used to generate models using the kNN, SVM, Random Forest, Logistic Regression, and Neural Network algorithms, which are suitable for machine learning classification models. The performance of the generated models was evaluated to derive the most suitable prediction model. The prediction model derived in this study separately evaluates three evaluation criteria (classification by type of landscape, classification by distance between landscape and target, and classification by preference) and then synthesizes the results into a prediction.
As a result of the study, a prediction model was developed with a high accuracy of 0.986 for the evaluation criterion according to the type of landscape, 0.973 for the evaluation criterion according to the distance, and 0.952 for the evaluation criterion according to the preference, and the verification process through the evaluation of data prediction results showed that the model exceeds its required performance value. As an experimental attempt to investigate the possibility of developing a prediction model using machine learning in landscape-related research, this study was able to confirm the possibility of creating a high-performance prediction model by building a data set through the collection and refinement of image data and subsequently utilizing it in landscape-related research fields. Based on the results, implications, and limitations of this study, it is believed that it is possible to develop various types of landscape prediction models, including wind power facility, natural, and cultural landscapes. Machine learning techniques can be more useful and valuable in the field of landscape architecture by exploring and applying research methods appropriate to the topic, by reducing the time of data classification through the study of a model that classifies images according to landscape types, or by analyzing the importance of landscape planning factors through the analysis of landscape prediction factors using machine learning.
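Of the algorithms listed in this abstract, kNN is the simplest to show in full. The sketch below is a generic majority-vote k-nearest-neighbor classifier, not the Orange workflow the authors used; the two-dimensional feature vectors and the "near"/"far" labels are hypothetical stand-ins for image features and the distance criterion.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points.

    train: list of (feature_vector, label) pairs.
    """
    neighbors = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]

# Hypothetical features for landscape images labeled by viewing distance.
train = [
    ([1.0, 1.0], "near"), ([1.2, 0.8], "near"), ([0.9, 1.1], "near"),
    ([5.0, 5.0], "far"), ([5.2, 4.8], "far"), ([4.9, 5.1], "far"),
]
print(knn_predict(train, [1.1, 1.0]))
print(knn_predict(train, [5.0, 4.9]))
```

In a setup like the one described, one such classifier per evaluation criterion (type, distance, preference) would be trained and their outputs combined afterwards.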

A Comparative Study of Prediction Models for College Student Dropout Risk Using Machine Learning: Focusing on the case of N university (머신러닝을 활용한 대학생 중도탈락 위험군의 예측모델 비교 연구 : N대학 사례를 중심으로)

  • So-Hyun Kim;Sung-Hyoun Cho
    • Journal of The Korean Society of Integrative Medicine / v.12 no.2 / pp.155-166 / 2024
  • Purpose: This study aims to identify key factors for predicting dropout risk at the university level and to provide a foundation for policy development aimed at dropout prevention. This study explores the optimal machine learning algorithm by comparing the performance of various algorithms using data on college students' dropout risks. Methods: Data on factors influencing dropout risk and propensity were collected from N University. The collected data were applied to several machine learning algorithms, including random forest, decision tree, artificial neural network, logistic regression, support vector machine (SVM), k-nearest neighbor (k-NN) classification, and Naive Bayes. The performance of these models was compared and evaluated, with a focus on predictive validity and the identification of significant dropout factors through the information gain index of machine learning. Results: The binary logistic regression analysis showed that the year of the program, department, grades, and year of entry had a statistically significant effect on the dropout risk. Among the machine learning algorithms, random forest performed the best. The relative importance of the predictor variables was highest for department, age, grades, and whether the student's residence matched the school location, in that order. Conclusion: Machine learning-based prediction of dropout risk focuses on the early identification of students at risk. The types and causes of dropout crises vary significantly among students. It is important to identify the types and causes of dropout crises so that appropriate actions and support can be taken to remove risk factors and increase protective factors. The relative importance of the factors affecting dropout risk found in this study will help guide educational prescriptions for preventing college student dropout.
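The information gain index mentioned in this abstract ranks a predictor by how much knowing its value reduces the entropy of the outcome. A minimal sketch for categorical features follows; the feature names and the four toy student records are hypothetical and are not taken from the N University data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical records: outcome plus two categorical predictors.
labels = ["drop", "drop", "stay", "stay"]
department = ["A", "A", "B", "B"]          # perfectly separates the outcome
residence = ["local", "away", "local", "away"]  # carries no information here
print(information_gain(department, labels))
print(information_gain(residence, labels))
```

Here `department` has a gain of 1 bit and `residence` a gain of 0, so a ranking by information gain would put `department` first.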

3D Face Recognition using Wavelet Transform Based on Fuzzy Clustering Algorithm (퍼지 군집화 알고리즘 기반의 웨이블릿 변환을 이용한 3차원 얼굴 인식)

  • Lee, Yeung-Hak
    • Journal of Korea Multimedia Society / v.11 no.11 / pp.1501-1514 / 2008
  • The face shape extracted from depth values varies in appearance and is among the most important facial information. Face images decomposed into frequency subbands express personal features in detail. In this paper, we develop a method for recognizing range face images from multiple frequency domains of each depth image using a modified fuzzy c-means algorithm. In the proposed approach, the first step finds the nose tip, which has a protruding shape, in the extracted face area. The second step normalizes the face to a frontal posture. Multiple contour line areas, whose shape differs for each person, are extracted using depth threshold values measured from the reference point, the nose tip. The frequency components extracted from the wavelet subbands are then adopted as feature information for the authentication problem. The third step applies eigenfaces to reduce the dimension, and linear discriminant analysis (LDA) is adopted to improve the classification ability between similar features. In the last step, individual classifiers using the modified fuzzy c-means method, with K-NN used to initialize the membership degrees, are applied to the extracted coefficients at each resolution level. In the experimental results, the depth threshold value 60 (DT60) showed the highest recognition rate among the extracted regions, and the proposed classification method achieved a 98.3% recognition rate in the case of fuzzy clustering.
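The membership degrees mentioned in the last step can be illustrated with the textbook fuzzy c-means membership formula. This sketch shows the standard update only; the paper's modified method and its K-NN-based initialization are not described in enough detail here to reproduce, and the point, the centers, and the fuzzifier `m = 2` are assumptions chosen for illustration.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of `point` in each cluster center.

    u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1)), where d_j is the distance
    from the point to center j and m > 1 is the fuzzifier.
    """
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with a center belongs to that center fully.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_j / d_k) ** power for d_k in dists) for d_j in dists]

# Hypothetical 2-D point at distance 1 from the first center and 2 from the second.
u = fcm_memberships([1.0, 0.0], [[0.0, 0.0], [3.0, 0.0]])
print(u)
```

The memberships always sum to 1, and the closer center receives the larger share (0.8 versus 0.2 in this example).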

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.29-45 / 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the investment returns of investors. As a result, predicting companies' credit ratings by applying statistical and machine learning techniques has been a popular research topic. Statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have traditionally been used in bond rating. However, one major drawback is that they rest on strict assumptions. Such assumptions include linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. These strict assumptions of traditional statistics have limited their application to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and the Support Vector Machine (SVM). In particular, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that maximizes the margin between two categories. SVM is simple enough to be analyzed mathematically, and leads to high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum, and thus overfitting is unlikely to occur with SVM. SVM also does not require many data samples for training, since it builds prediction models using only some representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can be potential causes for degrading SVM's performance.
First, SVM was originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not improve performance in multi-class classification problems as much as SVM does for binary-class classification. Second, approximation algorithms (e.g., decomposition methods and the sequential minimal optimization algorithm) can be used to reduce computation time in multi-class computation, but they can deteriorate classification performance. Third, a difficulty in multi-class prediction problems is the data imbalance problem, which occurs when the number of instances in one class greatly outnumbers the number of instances in another class. Such data sets often cause a default classifier to be built because of a skewed boundary, which reduces the classification accuracy of such a classifier. SVM ensemble learning is one of the machine learning methods for coping with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing the weight on the misclassified observations through iterations. The observations that are incorrectly predicted by previous classifiers are chosen more often than examples that are correctly predicted. Thus boosting attempts to produce new classifiers that are better able to predict examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of the misclassified observations of the minority class. This paper proposes multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem.
Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, its learning process can take into account the geometric mean-based accuracy and errors across multiple classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. 10-fold cross validation is performed three times with different random seeds in order to ensure that the comparison among the three classifiers does not happen by chance. For each 10-fold cross validation, the entire data set is first partitioned into ten equal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, cross-validated folds have been tested independently of each algorithm. Through these steps, we have obtained the results for the classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of each classifier over the 30 folds is significantly different. The results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
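The gap between the arithmetic and geometric mean-based accuracies reported above is easier to see with a small example. The sketch below assumes both means are taken over per-class recall, which is a common convention but is not spelled out in the abstract; the two-class toy labels are invented, and this is not the MGM-Boost algorithm itself.

```python
import math
from collections import defaultdict

def per_class_recall(y_true, y_pred):
    """Fraction of correctly predicted instances within each true class."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return {c: correct[c] / total[c] for c in total}

def arithmetic_mean_accuracy(recalls):
    return sum(recalls) / len(recalls)

def geometric_mean_accuracy(recalls):
    # One badly predicted class pulls the product, and so the mean, toward 0.
    return math.prod(recalls) ** (1.0 / len(recalls))

# Hypothetical imbalanced data: 8 instances of class A, 2 of class B.
y_true = ["A"] * 8 + ["B"] * 2
y_pred = ["A"] * 8 + ["A", "B"]
recalls = list(per_class_recall(y_true, y_pred).values())
print(arithmetic_mean_accuracy(recalls))
print(round(geometric_mean_accuracy(recalls), 4))
```

Because the geometric mean penalizes a weak minority class more heavily than the arithmetic mean does, it is always the lower of the two unless all classes are predicted equally well, which is consistent with the lower geometric figures quoted in the abstract.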

Optimizing Similarity Threshold and Coverage of CBR (사례기반추론의 유사 임계치 및 커버리지 최적화)

  • Ahn, Hyunchul
    • KIPS Transactions on Software and Data Engineering / v.2 no.8 / pp.535-542 / 2013
  • Because case-based reasoning (CBR) has many advantages, it has been used to support decision making in various areas, including medical checkups, production planning, and customer classification. However, several factors must be set heuristically when designing effective CBR systems. Among these factors, this study addresses the issue of selecting appropriate neighbors in the case retrieval step. As the criterion for selecting appropriate neighbors, conventional studies have used a preset number of neighbors to combine (i.e., the k of k-nearest neighbor) or a relative portion of the maximum similarity. In contrast, this study proposes to use an absolute similarity threshold varying from 0 to 1 as the criterion for selecting appropriate neighbors to combine. In this case, a similarity threshold value that is too small may make the model rarely produce a solution. To avoid this, we propose to adopt the coverage, defined as the ratio of the cases for which solutions are produced to the total number of training cases, and to set it as a constraint when optimizing the similarity threshold. To validate the usefulness of the proposed model, we applied it to a real-world target marketing case of an online shopping mall in Korea. As a result, we found that the proposed model might significantly improve the performance of CBR.
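The idea of retrieving neighbors by an absolute threshold and tracking coverage can be sketched as follows. Everything here is illustrative: the similarity function, the case base, and the queries are hypothetical, and the sketch assumes the threshold is a lower bound on similarity in [0, 1], which may differ from how the paper defines its threshold.

```python
def similarity(a, b):
    """Similarity in [0, 1]: 1 minus the mean absolute difference of
    features that are assumed to be scaled to [0, 1]."""
    return 1.0 - sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def retrieve(case_base, query, threshold):
    """All stored cases whose similarity to the query meets the threshold."""
    return [(f, outcome) for f, outcome in case_base if similarity(f, query) >= threshold]

def predict(case_base, query, threshold):
    """Average outcome of the retrieved cases, or None if none qualify."""
    neighbors = retrieve(case_base, query, threshold)
    if not neighbors:
        return None
    return sum(outcome for _, outcome in neighbors) / len(neighbors)

def coverage(case_base, queries, threshold):
    """Share of queries for which a solution is produced."""
    answered = sum(1 for q in queries if predict(case_base, q, threshold) is not None)
    return answered / len(queries)

# Hypothetical case base: (features, outcome) with outcome 1.0 = responded.
case_base = [([0.1, 0.2], 1.0), ([0.8, 0.9], 0.0)]
queries = [[0.1, 0.25], [0.5, 0.5], [0.85, 0.9]]
print(coverage(case_base, queries, 0.9))
print(coverage(case_base, queries, 0.6))
```

Sweeping the threshold and keeping only the values whose coverage stays above a chosen minimum is one simple way to treat coverage as the constraint described in the abstract.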