• Title/Summary/Keyword: Binary Classifier

Binary Classification of Hypertensive Retinopathy Using Deep Dense CNN Learning

  • Mostafa E.A., Ibrahim;Qaisar, Abbas
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.12
    • /
    • pp.98-106
    • /
    • 2022
  • Hypertensive retinopathy (HR) is a retinal condition associated with high blood pressure. The incidence of HR is directly correlated with the severity and duration of hypertension. To avoid blindness, it is essential to recognize and assess HR as early as possible. Few computer-aided diagnosis (CAD) systems are currently available for HR, and those that exist extract features from a variety of HR-related lesions and classify them with conventional machine-learning algorithms. Consequently, they require extensive and complicated image processing while remaining limited in application, and, as seen in recent similar systems, their classification accuracy is also lacking. To address these issues, a new CAD system for early HR diagnosis is developed using Deep Dense CNN Learning (DD-CNN). The system uses a pre-trained convolutional neural network as a feature extractor. A statistical evaluation on more than 1400 retinography images assesses the system using several performance metrics: specificity (SP), sensitivity (SE), area under the receiver operating characteristic curve (AUC), and accuracy (ACC). On average, we achieved an SE of 97%, an ACC of 98%, an SP of 99%, and an AUC of 0.98. These results indicate that the proposed DD-CNN classifier can be used to diagnose hypertensive retinopathy.
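
The four metrics named in this abstract can be computed from binary predictions and classifier scores. The sketch below is a generic illustration, not the authors' evaluation code; the function names and the 1/0 label coding are assumptions introduced here. AUC is computed as the fraction of (positive, negative) pairs that the scores rank correctly, with ties counted as half.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for labels coded as 1 (positive) and 0 (negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def binary_metrics(y_true, y_pred):
    """Return sensitivity (SE), specificity (SP) and accuracy (ACC)."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "SE": tp / (tp + fn),                    # true positive rate
        "SP": tn / (tn + fp),                    # true negative rate
        "ACC": (tp + tn) / (tp + tn + fp + fn),  # overall fraction correct
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a few thousand images; a rank-based formulation would be preferable for larger sets.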

Fake News Detector using Machine Learning Algorithms

  • Diaa Salama;Yomna Ibrahim;Radwa Mostafa;Abdelrahman Tolba;Mariam Khaled;John Gerges;Diaa Salama
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.7
    • /
    • pp.195-201
    • /
    • 2024
  • With the worldwide spread of Covid-19 (coronavirus), some people have exploited citizens' desperate need for news about this mysterious virus by spreading fake news. Some countries arrested people who spread fake news about it, and others fined them. Since social media has become a significant source of news, there is a pressing need to detect such fake news. The main aim of this research is to develop a web-based model that uses a combination of machine learning algorithms to detect fake news. The proposed model includes a framework that identifies tweets containing fake news using context analysis. We assumed that Natural Language Processing (NLP) alone would not be enough for context analysis, because tweets are usually short and often do not follow even the most basic syntactic rules, so we also used tweet features such as the number of retweets, the number of likes, and tweet length, and added a statistical credibility analysis of Twitter users. The proposed algorithms are tested on four different benchmark datasets. Finally, to obtain the best accuracy, we combined the two best-performing algorithms: SVM (which is widely accepted as a baseline classifier, especially for binary classification problems) and Naive Bayes.

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.29-45
    • /
    • 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the investment returns of investors. As a result, predicting companies' credit ratings with statistical and machine learning techniques has been a popular research topic. Statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have traditionally been used in bond rating. However, one major drawback is that they rest on strict assumptions. These assumptions include linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. Such strict assumptions have limited the application of traditional statistics to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and Support Vector Machine (SVM). In particular, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that maximizes the margin between two categories. SVM is simple enough to be analyzed mathematically and leads to high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum, and thus overfitting is unlikely to occur with SVM. Furthermore, SVM does not require many data samples for training, since it builds prediction models using only some representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can degrade SVM's performance.
First, SVM was originally proposed for solving binary classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not improve performance on multi-class problems as much as SVM does on binary problems. Second, approximation algorithms (e.g., decomposition methods and the sequential minimal optimization algorithm) can be used to reduce computation time in multi-class problems, but they can degrade classification performance. Third, multi-class prediction problems suffer from data imbalance, which occurs when the number of instances in one class greatly outnumbers the number of instances in another class. Such data sets often cause a default classifier to be built due to a skewed boundary, which reduces the classification accuracy of that classifier. SVM ensemble learning is one machine learning approach for coping with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the most widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing the weight on misclassified observations through iterations. Observations that are incorrectly predicted by previous classifiers are chosen more often than those that are correctly predicted. Thus, boosting attempts to produce new classifiers that are better able to predict examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of misclassified observations of the minority class. This paper proposes a multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem.
Since MGM-Boost introduces the notion of geometric mean into AdaBoost, its learning process can take into account the geometric mean-based accuracy and errors across multiple classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. 10-fold cross validation is performed three times with different random seeds in order to ensure that the comparison among the three classifiers does not happen by chance. For each 10-fold cross validation, the entire data set is first partitioned into ten equal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, cross-validated folds have been tested independently of each algorithm. Through these steps, we obtained results for the classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of each classifier over the 30 folds is significantly different. The results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
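
The contrast between arithmetic and geometric mean-based accuracy, and the 3 x 10-fold protocol that yields 30 experiments, can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: it assumes both means are taken over per-class accuracy (recall), and the function names, the seed handling and the simple unstratified fold assignment are all hypothetical.

```python
import math
import random
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Fraction of correctly predicted samples within each true class."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {c: correct[c] / total[c] for c in total}


def arithmetic_mean_accuracy(y_true, y_pred):
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def geometric_mean_accuracy(y_true, y_pred):
    # Drops to zero as soon as any single class is never predicted correctly.
    recalls = per_class_recall(y_true, y_pred)
    return math.prod(recalls.values()) ** (1.0 / len(recalls))


def repeated_kfold(n_samples, k=10, repeats=3, seed=0):
    """Yield (train_indices, test_indices) for `repeats` shuffled k-fold runs."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n_samples))
        rng.shuffle(idx)  # a different shuffle for every repeat
        folds = [idx[i::k] for i in range(k)]
        for i, test in enumerate(folds):
            train = [j for f, fold in enumerate(folds) if f != i for j in fold]
            yield train, test
```

Because the geometric mean multiplies per-class accuracies, a classifier that ignores a minority class is penalized far more heavily than under the arithmetic mean, which is consistent with the much lower geometric figures reported above.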

Binary classification of bolts with anti-loosening coating using transfer learning-based CNN (전이학습 기반 CNN을 통한 풀림 방지 코팅 볼트 이진 분류에 관한 연구)

  • Noh, Eunsol;Yi, Sarang;Hong, Seokmoo
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.22 no.2
    • /
    • pp.651-658
    • /
    • 2021
  • Because bolts with anti-loosening coatings are used mainly for joining safety-related components in automobiles, accurate automatic screening of these coatings is essential to detect defects efficiently. The performance of the convolutional neural network (CNN) used in a previous study [Identification of bolt coating defects using CNN and Grad-CAM] improved as the amount of data available for analyzing image patterns and characteristics increased. However, obtaining the necessary amount of data for coated bolts is difficult, and training is time-consuming. In this paper, using the same VGG16 model as in the previous study, transfer learning was applied to decrease the training time and achieve the same or better accuracy with less data. The classifier was trained considering the amount of training data in this study and its similarity to ImageNet data. In conjunction with the fully connected layer, the highest accuracy was achieved (95%). To enhance the performance further, the last convolution layer and the classifier were fine-tuned, which raised the accuracy by 2 percentage points (to 97%). This shows that the learning time can be reduced by transfer learning and fine-tuning while maintaining a high screening accuracy.

Frequency-Cepstral Features for Bag of Words Based Acoustic Context Awareness (Bag of Words 기반 음향 상황 인지를 위한 주파수-캡스트럴 특징)

  • Park, Sang-Wook;Choi, Woo-Hyun;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea
    • /
    • v.33 no.4
    • /
    • pp.248-254
    • /
    • 2014
  • Among acoustic signal analysis tasks, acoustic context awareness is one of the most formidable in terms of complexity, since it requires a sophisticated understanding of individual acoustic events. Conventional context awareness methods employ individual acoustic event detection or recognition to generate a relevant decision on the context. However, this approach may perform poorly in practical situations, because events can occur simultaneously or can be acoustically similar and therefore difficult to distinguish from each other. In particular, babble noise occurring in a bus or subway environment may confuse the context awareness task, since babble is similar in any environment. Therefore, in this paper, a frequency-cepstral feature vector is proposed to mitigate this confusion in a situation awareness task with a binary decision: bus or metro. With a Support Vector Machine (SVM) as the classifier, the proposed feature vector scheme is shown to perform better than the conventional scheme.

Multiple Discriminative DNNs for I-Vector Based Open-Set Language Recognition (I-벡터 기반 오픈세트 언어 인식을 위한 다중 판별 DNN)

  • Kang, Woo Hyun;Cho, Won Ik;Kang, Tae Gyoon;Kim, Nam Soo
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.41 no.8
    • /
    • pp.958-964
    • /
    • 2016
  • In this paper, we propose an i-vector based language recognition system to identify the spoken language of the speaker, which uses multiple discriminative deep neural network (DNN) models analogous to the multi-class support vector machine (SVM) classification system. The proposed model was trained and tested using the i-vectors included in the NIST 2015 i-vector Machine Learning Challenge database, and was shown to outperform conventional language recognition methods such as cosine distance, SVM, and softmax NN classifiers in open-set experiments.

Robustness of Face Recognition to Variations of Illumination on Mobile Devices Based on SVM

  • Nam, Gi-Pyo;Kang, Byung-Jun;Park, Kang-Ryoung
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.4 no.1
    • /
    • pp.25-44
    • /
    • 2010
  • With the increasing popularity of mobile devices, it has become necessary to protect private information and content on these devices. Face recognition has been favored over conventional passwords or security keys, because it can be easily implemented using a built-in camera while providing user convenience. However, because mobile devices can be used both indoors and outdoors, there can be many illumination changes, which can reduce the accuracy of face recognition. Therefore, we propose a new face recognition method for mobile devices that is robust to illumination variations. This research makes the following four original contributions. First, we compared the performance of face recognition under illumination variations on mobile devices for several illumination normalization procedures suitable for devices with low processing power. These include the Retinex filter, histogram equalization, and histogram stretching. Second, we compared the performance of global and local face recognition methods such as PCA (Principal Component Analysis), LNMF (Local Non-negative Matrix Factorization), and LBP (Local Binary Pattern), using an integer-based kernel suitable for mobile devices with low processing power. Third, the characteristics of each method according to the illumination variations are analyzed. Fourth, among several illumination normalization methods, we use two matching scores, from Retinex and histogram stretching, which show the best and second-best performance, respectively. These are used as the inputs of an SVM (Support Vector Machine) classifier, which can increase the accuracy of face recognition. Experimental results with two databases (data collected by a mobile device and the AR database) showed that the accuracy of face recognition achieved by the proposed method was superior to that of other methods.
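
Two of the illumination normalization procedures named in this abstract, histogram stretching and histogram equalization, can be sketched with integer-only arithmetic, in the spirit of the low-processing-power constraint the abstract mentions. This is a generic textbook formulation on a flat list of 8-bit grayscale values, not the authors' implementation; the function names are assumptions, and floor division is used here where many references round to the nearest level.

```python
def histogram_stretch(pixels, lo=0, hi=255):
    """Linearly map the observed [min, max] intensity range onto [lo, hi]."""
    p_min, p_max = min(pixels), max(pixels)
    if p_max == p_min:
        return list(pixels)  # flat image: nothing to stretch
    span = p_max - p_min
    # Integer arithmetic only: multiply first, then floor-divide.
    return [lo + (p - p_min) * (hi - lo) // span for p in pixels]


def histogram_equalize(pixels, levels=256):
    """Remap intensities through the cumulative histogram (CDF)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)  # CDF at the darkest present level
    n = len(pixels)
    if n == cdf_min:
        return list(pixels)  # only one intensity level present
    return [(cdf[p] - cdf_min) * (levels - 1) // (n - cdf_min) for p in pixels]
```

Stretching preserves the shape of the histogram and only rescales it, while equalization redistributes intensities so that the cumulative histogram becomes approximately linear.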

Rank-based Multiclass Gene Selection for Cancer Classification with Naive Bayes Classifiers based on Gene Expression Profiles (나이브 베이스 분류기를 이용한 유전발현 데이타기반 암 분류를 위한 순위기반 다중클래스 유전자 선택)

  • Hong, Jin-Hyuk;Cho, Sung-Bae
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.35 no.8
    • /
    • pp.372-377
    • /
    • 2008
  • Multiclass cancer classification based on gene expression profiles has been actively investigated; it determines the type of cancer by analyzing the large amount of gene expression data collected by DNA microarray technology. Since gene expression data include many genes unrelated to the target cancer, informative genes must be selected in order to obtain highly accurate classification. Conventional rank-based gene selection methods often use ideal marker genes originally devised for binary classification, so it is difficult to apply them directly to multiclass classification. In this paper, we propose a novel method for multiclass gene selection, which does not use ideal marker genes but directly analyzes the distribution of gene expression. It measures class discriminability by discretizing gene expression levels into several regions and analyzing the frequency of training samples in each region, and then classifies samples using the naive Bayes classifier. We have demonstrated the usefulness of the proposed method on various representative benchmark datasets for multiclass cancer classification.
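
The idea of ranking genes by discretizing expression levels into regions and counting training samples per class in each region can be sketched as below. The abstract does not give the paper's discriminability measure, so the purity score used here (the fraction of samples that belong to the majority class of their region) is a hypothetical stand-in, as are the equal-width regions, the function names and the default of three regions.

```python
from collections import Counter, defaultdict


def bin_index(value, lo, hi, n_bins):
    """Index of the equal-width region of [lo, hi] that contains value."""
    if hi == lo:
        return 0
    i = int((value - lo) * n_bins / (hi - lo))
    return max(0, min(i, n_bins - 1))  # clamp so that value == hi stays in range


def purity_score(values, labels, n_bins=3):
    """Stand-in discriminability: share of samples in their region's majority class."""
    lo, hi = min(values), max(values)
    counts = defaultdict(Counter)  # region index -> class frequency
    for v, y in zip(values, labels):
        counts[bin_index(v, lo, hi, n_bins)][y] += 1
    return sum(max(c.values()) for c in counts.values()) / len(values)


def rank_genes(expression, labels, top_k, n_bins=3):
    """Return the top_k gene names, ordered by descending purity score.

    expression maps a gene name to its expression values, one per training sample.
    """
    scores = {g: purity_score(v, labels, n_bins) for g, v in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]
```

The selected genes, once discretized in the same way, would then serve as categorical features for a naive Bayes classifier, which estimates exactly these per-region class frequencies.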

Audio Segmentation and Classification Using Support Vector Machine and Fuzzy C-Means Clustering Techniques (서포트 벡터 머신과 퍼지 클러스터링 기법을 이용한 오디오 분할 및 분류)

  • Nguyen, Ngoc;Kang, Myeong-Su;Kim, Cheol-Hong;Kim, Jong-Myon
    • The KIPS Transactions:PartB
    • /
    • v.19B no.1
    • /
    • pp.19-26
    • /
    • 2012
  • The rapid increase in information imposes new demands on content management. The purpose of automatic audio segmentation and classification is to meet the rising need for efficient content management. For this reason, this paper proposes a high-accuracy algorithm that segments audio signals and classifies them into different classes such as speech, music, silence, and environmental sounds. The proposed algorithm uses a support vector machine (SVM) to detect audio cuts, which are boundaries between different kinds of sounds, using the parameter sequence. We then extract feature vectors composed of statistical data and use them as input to a fuzzy c-means (FCM) classifier to partition audio segments into different classes. To evaluate the segmentation and classification performance of the proposed SVM-FCM based algorithm, we consider precision and recall rates for segmentation and classification accuracy for classification. Furthermore, we compare the proposed algorithm with other methods, including binary and FCM classifiers, in terms of segmentation performance. Experimental results show that the proposed algorithm outperforms other methods in both precision and recall rates.

EMD based Cardiac Arrhythmia Classification using Multi-class SVM (다중 클래스 SVM을 이용한 EMD 기반의 부정맥 신호 분류)

  • Lee, Geum-Boon;Cho, Beom-Joon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.14 no.1
    • /
    • pp.16-22
    • /
    • 2010
  • Electrocardiogram (ECG) analysis and arrhythmia recognition are critical for the diagnosis and treatment of patients. Cardiac arrhythmia is a condition in which the heartbeat may be irregular, and it presents a serious threat to patients recovering from ventricular tachycardia (VT) and ventricular fibrillation (VF). Other arrhythmias, such as atrial premature contraction (APC), premature ventricular contraction (PVC), and supraventricular tachycardia (SVT), are important in diagnosing heart disease. This paper presents a new method that classifies various arrhythmias, in contrast to other techniques that are limited to only two or three arrhythmias. The ECG is decomposed into Intrinsic Mode Functions (IMFs) by Empirical Mode Decomposition (EMD). The Burg algorithm is applied to the IMFs to obtain autoregressive (AR) coefficients, which reduce the dimension of the feature vector and are used as inputs to a multi-class SVM, which is an extension of the binary SVM. We chose optimal parameters for the SVM classifier and applied it to arrhythmia classification, achieving detection accuracies of 96.8% to 99.5% for NSR, APC, PVC, SVT, VT and VP. The results showed that EMD is useful for preprocessing and feature extraction, and the multi-class SVM for classifying cardiac arrhythmias.
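
How a binary classifier is extended to several classes can be illustrated with the one-vs-rest scheme: one binary model per class, with the highest-scoring model deciding the label. The abstract does not say which extension the authors used, so one-vs-rest is an assumption here, and a simple perceptron stands in for the binary SVM to keep the sketch dependency-free. The function names and the two-dimensional toy features are hypothetical and are not AR coefficients.

```python
def train_binary(points, targets, epochs=1000):
    """Perceptron stand-in for a binary SVM; targets are +1 or -1."""
    w = [0.0] * (len(points[0]) + 1)  # last weight acts as the bias
    for _ in range(epochs):
        mistakes = 0
        for x, t in zip(points, targets):
            xa = list(x) + [1.0]
            score = sum(wi * xi for wi, xi in zip(w, xa))
            if t * score <= 0:  # misclassified or on the boundary
                w = [wi + t * xi for wi, xi in zip(w, xa)]
                mistakes += 1
        if mistakes == 0:  # every training point is on the correct side
            break
    return w


def train_one_vs_rest(points, labels):
    """Train one binary model per class: that class (+1) against all others (-1)."""
    return {
        c: train_binary(points, [1 if y == c else -1 for y in labels])
        for c in sorted(set(labels))
    }


def predict(models, x):
    """Assign the class whose binary model gives the highest score."""
    xa = list(x) + [1.0]
    return max(
        models,
        key=lambda c: sum(wi * xi for wi, xi in zip(models[c], xa)),
    )
```

One-vs-rest needs only as many binary models as there are classes, whereas one-against-one trains a model for every pair of classes and decides by voting.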