A New Pattern Classification and the Analysis of the Lung Sound by Using Cepstrum (Cepstrum을 이용한 폐음의 분석 및 패턴 분류)

  • 김종원;김성환
    • Journal of Biomedical Engineering Research / v.15 no.2 / pp.159-166 / 1994
  • A new pattern classification algorithm that uses the cepstrum to analyze lung sounds is proposed for classifying patterns associated with pulmonary and bronchial disorders. To evaluate the performance of the proposed method, its results are compared with those of pattern classification based on AR modeling. The experiment used lung sounds recorded for the training of physicians. The classification accuracy was 92.3% for the cepstrum method and 53.8% for AR modeling, so the cepstrum method performed considerably better than AR modeling and proved to be a very efficient algorithm.

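
The abstract above does not say how the cepstrum features are computed. As a minimal sketch only, the standard real cepstrum (the inverse Fourier transform of the log magnitude spectrum) can be written as below. The function names, the naive DFT, and the `eps` guard are assumptions made for illustration and are not taken from the paper.

```python
import cmath
import math

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; adequate for short frames."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def real_cepstrum(frame, eps=1e-12):
    """Real cepstrum: inverse DFT of the log magnitude spectrum.

    eps is a hypothetical guard against log(0) for empty spectral bins.
    """
    spectrum = dft(frame)
    log_magnitude = [math.log(abs(c) + eps) for c in spectrum]
    return [c.real for c in dft(log_magnitude, inverse=True)]
```

The low-order cepstral coefficients of each sound frame would then typically serve as the feature vector passed to a classifier; which coefficients and which classifier the authors used is not stated in the abstract.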

The Effect of Meta-Features of Multiclass Datasets on the Performance of Classification Algorithms (다중 클래스 데이터셋의 메타특징이 판별 알고리즘의 성능에 미치는 영향 연구)

  • Kim, Jeonghun;Kim, Min Yong;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.23-45 / 2020
  • Big data is being created in a wide variety of fields such as medical care, manufacturing, logistics, sales sites, and SNS, and dataset characteristics are correspondingly diverse. To secure their competitiveness, companies need to improve their decision-making capacity using classification algorithms. However, most of them lack sufficient knowledge of which classification algorithm is appropriate for a specific problem area. In other words, determining which classification algorithm suits the characteristics of a dataset has been a task requiring expertise and effort. This is because the relationship between the characteristics of datasets (called meta-features) and the performance of classification algorithms has not been fully understood. Moreover, there has been little research on meta-features that reflect the characteristics of multi-class data. Therefore, the purpose of this study is to empirically analyze whether meta-features of multi-class datasets have a significant effect on the performance of classification algorithms. In this study, meta-features of multi-class datasets were grouped into two factors (data structure and data complexity), and seven representative meta-features were selected. Among those, we included the Herfindahl-Hirschman Index (HHI), originally a measure of market concentration, as a replacement for the IR (Imbalanced Ratio). We also added a newly developed index, the Reverse ReLU Silhouette Score, to the meta-feature set. From the UCI Machine Learning Repository, six representative datasets (Balance Scale, PageBlocks, Car Evaluation, User Knowledge-Modeling, Wine Quality (red), Contraceptive Method Choice) were selected. Each dataset was classified using the classification algorithms selected in the study (KNN, Logistic Regression, Naïve Bayes, Random Forest, and SVM). For each dataset, we applied 10-fold cross-validation.
Oversampling of 10% to 100% is applied to each fold, and the meta-features of the dataset are measured. The meta-features selected are HHI, Number of Classes, Number of Features, Entropy, Reverse ReLU Silhouette Score, Nonlinearity of Linear Classifier, and Hub Score. F1-score was selected as the dependent variable. The results showed that six meta-features, including the Reverse ReLU Silhouette Score and HHI proposed in this study, have a significant effect on classification performance. (1) The meta-feature HHI proposed in this study was significant for classification performance. (2) The number of variables has a significant effect on classification performance and, unlike the number of classes, the effect is positive. (3) The number of classes has a negative effect on classification performance. (4) Entropy has a significant effect on classification performance. (5) The Reverse ReLU Silhouette Score also significantly affects classification performance at a significance level of 0.01. (6) The nonlinearity of linear classifiers has a significant negative effect on classification performance. In addition, the results of the analysis by classification algorithm were consistent. In the regression analysis by classification algorithm, the number of variables does not have a significant effect for the Naïve Bayes algorithm, unlike the other classification algorithms. This study makes two theoretical contributions: (1) two new meta-features (HHI and the Reverse ReLU Silhouette Score) were shown to be significant; (2) the effects of data characteristics on classification performance were investigated using meta-features. As practical contributions, (1) the results can be utilized in developing a system that recommends classification algorithms according to the characteristics of datasets.
(2) Because data characteristics differ, many data scientists search for the optimal algorithm for a given situation by repeatedly adjusting algorithm parameters and testing. This process wastes considerable resources in hardware, cost, time, and manpower. This study is expected to be useful for machine learning and data mining researchers, practitioners, and developers of machine learning-based systems. The paper consists of an introduction, related research, research model, experiment, and conclusion and discussion.
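
The HHI meta-feature mentioned above is, in its usual market-concentration form, the sum of squared shares. As a rough sketch, assuming the shares are the class proportions of a dataset (the abstract does not spell out the exact definition, and the function name is hypothetical), it could be computed as:

```python
from collections import Counter

def herfindahl_hirschman_index(labels):
    """Sum of squared class proportions for a sequence of class labels.

    Assumption: each class's "market share" is its fraction of the samples.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("labels must not be empty")
    return sum((count / total) ** 2 for count in counts.values())

# A perfectly balanced k-class dataset gives 1/k; a single-class dataset gives 1.
balanced = herfindahl_hirschman_index(["a", "a", "b", "b"])
single = herfindahl_hirschman_index(["a", "a", "a", "a"])
```

Under this reading, values near 1 indicate that one class dominates, which is how the index can stand in for an imbalance ratio when there are more than two classes.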

Fuzzy Classification Using EM Algorithm

  • Lee Sang-Hoon
    • Proceedings of the KSRS Conference / 2005.10a / pp.675-677 / 2005
  • This study proposes a fuzzy classification method using the EM algorithm. For cluster validation, this approach iteratively estimates the class parameters in fuzzy training for the sample classes and continuously computes the log-likelihood ratio of two consecutive class numbers. The maximum ratio rule is applied to determine the optimal number of classes.


Data Classification Using the Robbins-Monro Stochastic Approximation Algorithm (로빈스-몬로 확률 근사 알고리즘을 이용한 데이터 분류)

  • Lee, Jae-Kook;Ko, Chun-Taek;Choi, Won-Ho
    • Proceedings of the KIPE Conference / 2005.07a / pp.624-627 / 2005
  • This paper presents a new data classification method using the Robbins-Monro stochastic approximation algorithm, k-nearest neighbor, and distribution analysis. To cluster the data set, we determine the centroid of the test data set and the local area of the data set using the k-nearest neighbor algorithm. To decide the class of each data point, the Robbins-Monro stochastic approximation algorithm is applied to the determined local area of the data set. To evaluate its performance, the proposed classification method is compared with the conventional fuzzy c-means method and the k-NN algorithm. The simulation results show that the proposed method is more accurate than the fuzzy c-means method, the k-NN algorithm, and the discriminant analysis algorithm.

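
For readers unfamiliar with it, the generic Robbins-Monro iteration referred to above updates an estimate with noisy observations and decreasing step sizes a/n. The sketch below shows only that generic iteration on a toy problem (estimating a mean); it is not the paper's classification procedure, and all names are hypothetical.

```python
def robbins_monro(observe, x0, n_steps, a=1.0):
    """Generic Robbins-Monro iteration: x <- x - (a / n) * observe(x, n).

    observe(x, n) returns a noisy evaluation of the function whose root is sought.
    """
    x = x0
    for n in range(1, n_steps + 1):
        step = a / n  # step sizes sum to infinity while their squares sum to a finite value
        x = x - step * observe(x, n)
    return x

def running_mean_via_rm(samples):
    """Toy use: the root of E[x - Y] = 0 is the mean of Y."""
    def observe(x, n):
        return x - samples[n - 1]
    return robbins_monro(observe, x0=0.0, n_steps=len(samples))
```

With `a=1.0` the iterate after n steps coincides with the sample mean of the first n observations, which makes the toy case easy to check by hand.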

Multiclass-based AdaBoost Algorithm (다중 클래스 아다부스트 알고리즘)

  • Kim, Tae-Hyun;Park, Dong-Chul
    • Journal of the Institute of Electronics Engineers of Korea CI / v.48 no.1 / pp.44-50 / 2011
  • In this paper, we propose a multi-class AdaBoost algorithm for efficient classification of multi-class data. The traditional AdaBoost algorithm is basically a binary classifier and has limitations when applied to multi-class problems, even though multi-class versions are available. To overcome these problems, we devise an AdaBoost architecture with a training algorithm that uses multi-class classifiers as its weak classifiers instead of a series of binary classifiers. Experiments are performed on an image classification problem using images collected from the Caltech Image Database. The results show that the proposed AdaBoost architecture can reduce training time while keeping its classification accuracy competitive with that of AdaBoost.M2.

An Improvement of Mathematical Classification Method of Wallpapers and Its Application (벽지의 수학적 분류 방법의 개선 및 활용)

  • Shin, Hyunyong;Han, Inki;Na, Junyoung
    • East Asian mathematical journal / v.33 no.2 / pp.123-147 / 2017
  • This paper discusses a mathematical analysis of wallpaper types and seeks an efficient algorithm for classifying them. We study some previous classification methods, develop a systematic process, and present some examples of determining wallpaper types with our algorithm. Through this approach, we expect to introduce a mathematical perspective on the relation between real life and mathematics.

Color Image Retrieval Using Block-based Classification (블록단위 특성분류를 이용한 컬러영상 검색)

  • 류명분;우석훈;박동권;원치선
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1996.06a / pp.63-66 / 1996
  • In this paper, we propose a new content-based color image retrieval algorithm. The algorithm makes use of two features: colors as global features and block classification results as local features. More specifically, we obtain R, G, B color histograms and classify nonoverlapping small image blocks into texture, monotone, and various edge types, then use these histograms and classification results to construct a similarity measure. Experimental results show that the retrieval rate of the proposed algorithm is higher than that of the previous method.


Learning Networks for Learning the Pattern Vectors causing Classification Error (분류오차유발 패턴벡터 학습을 위한 학습네트워크)

  • Lee Yong-Gu;Choi Woo-Seung
    • Journal of the Korea Society of Computer and Information / v.10 no.5 s.37 / pp.77-86 / 2005
  • In this paper, we design an LVQ learning algorithm that extracts the pattern vectors causing classification errors, learns them, and thereby improves classification performance. The proposed LVQ learning algorithm is a learning network that uses SOM to learn the initial reference vectors and the out-star learning algorithm to determine the classes of the LVQ output neurons. To extract the pattern vectors that cause classification errors, we propose an error-cause condition; using that condition, we construct a pattern vector space consisting of the input pattern vectors that cause classification errors, learn these pattern vectors, and improve pattern classification performance. To verify the performance of the proposed learning algorithm, simulations are performed using training and test vectors from Fisher's Iris data and EMG data, and the classification performance of the proposed method is compared with that of the conventional LVQ. The results confirm that the proposed learning method classifies more successfully than the conventional one.


Power Efficient Classification Method for Sensor Nodes in BSN Based ECG Monitoring System

  • Zeng, Min;Lee, Jeong-A
    • The Journal of Korean Institute of Communications and Information Sciences / v.35 no.9B / pp.1322-1329 / 2010
  • As body sensor network (BSN) research matures, the need to manage the power consumption of sensor nodes has become evident, since most applications are designed for continuous monitoring. Real-time Electrocardiograph (ECG) analysis on sensor nodes is proposed as an optimal way to save power by reducing data transmission overhead. Smart sensor nodes able to categorize the most recently detected ECG cycles communicate with the base station only when ECG cycles are classified as abnormal. In this paper, we describe ECG classification algorithms that categorize detected ECG cycles as normal or abnormal, or as more specific cardiac diseases. Our Euclidean distance (ED) based classification method is shown to be the most power efficient and very accurate in determining normal or abnormal ECG cycles. A close comparison of power efficiency and classification accuracy between our ED classification algorithm and a generalized linear model (GLM) based classification algorithm is provided. Through experiments we show that the CPU cycle power consumption of the ED based classification algorithm can be reduced by 31.21% and overall power consumption by at most 13.63% compared with the GLM based method. The accuracy of detecting NSR, APC, PVC, SVT, VT, and VF using the GLM based method ranges from 55% to 99%, while the accuracy of detecting normal and abnormal ECG cycles using our ED based method is higher than 86%.
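
The abstract does not describe how the ED based classifier is built. One plausible reading, sketched here purely as an assumption, is that each detected cycle is compared with a stored normal-cycle template and flagged as abnormal when the Euclidean distance exceeds a threshold. The template, the threshold, and the function names are all hypothetical.

```python
import math

def euclidean_distance(a, b):
    """Euclidean distance between two equal-length sample sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def classify_cycle(cycle, normal_template, threshold):
    """Label a cycle 'normal' if it lies within `threshold` of the template."""
    distance = euclidean_distance(cycle, normal_template)
    return "normal" if distance <= threshold else "abnormal"

# Hypothetical usage: only abnormal cycles would be transmitted to the base station.
template = [0.0, 1.0, 0.0]
label_close = classify_cycle([0.0, 1.1, 0.0], template, threshold=0.5)
label_far = classify_cycle([0.0, 3.0, 0.0], template, threshold=0.5)
```

A distance comparison of this kind needs only subtractions, multiplications, and one square root per cycle, which is consistent with the abstract's emphasis on low CPU power, although the paper's actual computation may differ.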