• Title/Summary/Keyword: Classification Accuracy Test

Modality-Based Sentence-Final Intonation Prediction for Korean Conversational-Style Text-to-Speech Systems

  • Oh, Seung-Shin;Kim, Sang-Hun
    • ETRI Journal / v.28 no.6 / pp.807-810 / 2006
  • This letter presents a prediction model for sentence-final intonation in Korean conversational-style text-to-speech systems, introducing the linguistic feature of 'modality' as a new parameter. Based on their function and meaning, we classify tonal forms in speech data into tone types meaningful for speech synthesis and use the result of this classification to build our prediction model with a tree-structured classification algorithm. To show that modality is more effective for the prediction model than features such as sentence type or speech act, an experiment is performed on a test set of 970 utterances with a training set of 3,883 utterances. The results show that modality contributes more to the determination of sentence-final intonation than sentence type or speech act, and that prediction accuracy improves by up to 25% when the modality feature is introduced.

The Threat List Acquisition Method in an Engagement Area using the Support Vector Machines (SVM을 이용한 교전영역 내 위협목록 획득방법)

  • Koh, Hyeseung
    • Journal of the Korea Institute of Military Science and Technology / v.19 no.2 / pp.236-243 / 2016
  • This paper presents a method for acquiring a threat list in an engagement area using support vector machines (SVM). The proposed method consists of track creation, track estimation, track feature extraction, and threat list classification. To classify threat tracks robustly, dynamic track estimation and pattern recognition algorithms are used. Dynamic tracks are estimated accurately by approximating track movement from position, velocity, and time. After track estimation, track features are extracted from the track information and used to classify the threat list. Experimental results showed that the method achieved an accuracy of about 95% over all test tracks when using the SVM classifier. If real-time processing is improved through further study, the method is expected to be applicable to fire control systems.

A Study on Classifications of Useful Customer Reviews by Applying Text Mining Approach (텍스트 마이닝을 활용한 고객 리뷰의 유용성 지수 개선에 관한 연구)

  • Lee, Hong Joo
    • Journal of Information Technology Services / v.14 no.4 / pp.159-169 / 2015
  • Customer reviews are an important source of information for purchase decisions in online stores, and online stores have tried to show useful reviews on product pages. To assess the usefulness of customer reviews before enough users have voted on them, previous studies have utilized diverse aspects of reviews, in many cases style and semantic information. This study aims to test diverse algorithms and datasets to identify a proper classification method and threshold for classifying useful reviews. Most previous research used a ratio-type helpfulness index, as Amazon.com does. However, there is another type of usefulness index, the count-type helpfulness index, used by TripAdvisor.com and Yelp.com, and no proper threshold for classifying useful reviews had yet been established for it. This study used restaurant reviews and their usefulness votes from Yelp.com to devise diverse datasets and applied text mining approaches to classify useful reviews. Random Forest, SVM, and GLMNET showed higher accuracy than the other approaches.

Discrimination model using denoising autoencoder-based majority vote classification for reducing false alarm rate

  • Heonyong Lee;Kyungtak Yu;Shiu Kim
    • Nuclear Engineering and Technology / v.55 no.10 / pp.3716-3724 / 2023
  • Loose parts monitoring and alarm-type detection in a real nuclear power plant (NPP) face challenges such as background noise, insufficient alarm data, and the difficulty of distinguishing alarm data that occur during start and stop. Although many signal processing methods and alarm determination algorithms have been developed, it is not easy to determine whether an alarm is valid and to extract meaningful data from an alarm signal that includes background noise. To address these issues, this paper proposes a denoising autoencoder (DAE)-based majority vote classification. Training and test data are prepared by acquiring alarm data from a real NPP and, for data augmentation, from a simulation facility; noisy data are reproduced by adding Gaussian noise. Using DAEs with 3, 5, 7, and 9 layers, features are extracted by each model and classified by neural networks. Finally, the results obtained from each DAE are combined by majority voting. The accuracy and false alarm rate are compared with those of other methods, and the superiority of the proposed method is confirmed.
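
The majority-voting step in this abstract can be illustrated with a minimal Python sketch. The function names, the labels, and the tie-breaking rule (first label seen wins) are assumptions made for illustration; the abstract does not say how ties among the four models are resolved, and the predictions below are made-up placeholders, not results from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label among per-model predictions for one sample.

    Assumption: ties go to the tied label that appears first in the list.
    """
    counts = Counter(predictions)
    best = max(counts.values())
    for label in predictions:
        if counts[label] == best:
            return label


def vote_batch(per_model_predictions):
    """Combine one list of labels per model (all the same length) sample by sample."""
    return [majority_vote(list(sample)) for sample in zip(*per_model_predictions)]


# Hypothetical outputs of four classifiers, e.g. one per DAE depth (3, 5, 7, 9 layers).
preds = [
    ["alarm", "false", "alarm"],
    ["alarm", "false", "false"],
    ["false", "false", "alarm"],
    ["alarm", "alarm", "alarm"],
]
print(vote_batch(preds))  # ['alarm', 'false', 'alarm']
```

With an even number of voters a tie is possible, so any real implementation would need an explicit tie rule such as the one assumed here.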

Anomaly-Based Network Intrusion Detection: An Approach Using Ensemble-Based Machine Learning Algorithm

  • Kashif Gul Chachar;Syed Nadeem Ahsan
    • International Journal of Computer Science & Network Security / v.24 no.1 / pp.107-118 / 2024
  • With the continued growth of technology, network usage requirements are expanding day by day. Most electronic devices are capable of communication, which demands a secure and reliable network. A network-based intrusion detection system (NIDS) is a new method for protecting computers and networks from attacks and raising alerts. Machine learning is an emerging field that provides a variety of ways to implement an effective NIDS. Bagging and boosting are two ensemble ML techniques renowned for better performance in learning and classification. This paper provides a detailed literature review of past work and proposes a novel ensemble approach to developing a NIDS based on the voting method, using bagging and boosting ensemble techniques. The test results demonstrate that the ensemble of bagging and boosting through voting exhibits the highest classification accuracy, 99.98%, and the minimum false positive rate (FPR) on both datasets. The model building time is average, however, which may be traded off against processor speed.

A Study on Improving the Accuracy of Medical Images Classification Using Data Augmentation

  • Cheon-Ho Park;Min-Guan Kim;Seung-Zoon Lee;Jeongil Choi
    • Journal of the Korea Society of Computer and Information / v.28 no.12 / pp.167-174 / 2023
  • This paper attempts to improve the accuracy of a colorectal cancer diagnosis model by applying image data augmentation in a convolutional neural network. Augmentation was performed with basic image manipulation methods: flipping, rotation, translation, shearing, and zooming. The 5,000 available images were split into 4,000 training images and 1,000 test images, and the model was trained after adding 4,000 and 8,000 augmented images to the 4,000 training images. The evaluation showed classification accuracies of 85.1%, 87.0%, and 90.2% for 4,000, 8,000, and 12,000 training images, respectively, confirming that accuracy improves as the amount of image data increases.
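
As a rough illustration of the basic image manipulations named in this abstract, the sketch below applies flipping, a 90-degree rotation, and a translation to an image stored as a nested list of pixel values. All function names are hypothetical, the paper's actual tooling and parameters are not given in the abstract, and shearing and zooming are left out because they need interpolation.

```python
def flip_horizontal(img):
    """Mirror each row left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Reverse the order of the rows."""
    return [row[:] for row in img[::-1]]


def rotate_90(img):
    """Rotate the image clockwise by 90 degrees."""
    return [list(row) for row in zip(*img[::-1])]


def translate_right(img, shift, fill=0):
    """Shift pixels right by `shift` columns, padding the left edge with `fill`."""
    width = len(img[0])
    shift = min(shift, width)
    return [[fill] * shift + row[:width - shift] for row in img]


def augment(images, transforms):
    """Return the originals plus one transformed copy per (image, transform) pair."""
    out = list(images)
    for img in images:
        for transform in transforms:
            out.append(transform(img))
    return out


img = [[1, 2], [3, 4]]
print(rotate_90(img))  # [[3, 1], [4, 2]]
# One transform doubles the set, two transforms triple it, mirroring the
# 4,000 -> 8,000 -> 12,000 growth described in the abstract.
print(len(augment([img] * 4, [flip_horizontal, rotate_90])))  # 12
```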

A Comparative Study on the Possibility of Land Cover Classification of the Mosaic Images on the Korean Peninsula (한반도 모자이크 영상의 토지피복분류 활용 가능성 탐색을 위한 비교 연구)

  • Moon, Jiyoon;Lee, Kwang Jae
    • Korean Journal of Remote Sensing / v.35 no.6_4 / pp.1319-1326 / 2019
  • The Korea Aerospace Research Institute (KARI) operates the government satellite information application consultation to cope with ever-increasing demand for satellite images in the public sector, and carries out various support projects, including the annual generation and provision of mosaic images of the Korean Peninsula, to enhance user convenience and promote the use of satellite images. In particular, the government has sought to increase the utilization of these mosaic images and to classify and update them so that users can easily apply them in their work. However, it is necessary to test and verify whether classification results from the mosaic images can be used in the field, since the original spectral information is distorted during pan-sharpening and color balancing, and only the R, G, and B bands are provided. Therefore, this study compared the reliability of the classification result of the mosaic image with that of a KOMPSAT-3 image. The accuracy of the KOMPSAT-3 classification was between 81% and 86% (overall accuracy about 85%), while that of the mosaic image was between 69% and 72% (overall accuracy about 72%). This difference is attributed not only to the distortion of the original spectral information through the pan-sharpening and mosaic processes, but also to the fact that NDVI and NDWI information was extracted from the KOMPSAT-3 image but not from the mosaic image, as only three color bands (R, G, B) were provided. Although it is deemed inadequate to distribute classification results extracted from mosaic images at present, it will be necessary to explore ways to minimize the distortion of spectral information when making mosaic images, to develop classification techniques suitable for mosaic images, and to provide NIR band information.
In addition, the utilization of images with limited spectral information is expected to increase if related research continues, such as comparative analysis of classification results by geomorphological characteristics and the development of machine learning methods for image classification by object of interest.

Comparative Evaluation of Chest Image Pneumonia based on Learning Rate Application (학습률 적용에 따른 흉부영상 폐렴 유무 분류 비교평가)

  • Kim, Ji-Yul;Ye, Soo-Young
    • Journal of the Korean Society of Radiology / v.16 no.5 / pp.595-602 / 2022
  • This study sought the most efficient learning rate for accurate and efficient automatic diagnosis of pneumonia in chest X-ray images using deep learning. The learning rate of the Inception V3 deep learning model was set to 0.1, 0.01, 0.001, and 0.0001, and modeling was performed three times for each setting. The average accuracy and loss function value of validation modeling and the metrics of test modeling were used as performance evaluation indicators, and performance was compared using the average of the three runs. In both the validation and the test evaluation, modeling with a learning rate of 0.001 showed the highest accuracy and excellent performance. This paper therefore recommends a learning rate of 0.001 when classifying the presence or absence of pneumonia in chest X-ray images with a deep learning model. Deep learning modeling with this learning rate was also judged capable of playing an auxiliary role in that classification. If research on the diagnosis and classification of pneumonia using deep learning continues, the contents of this study can be used as basic data, and they are expected to help in selecting an efficient learning rate for classifying medical images using artificial intelligence.

Research on Oriental Medicine Diagnosis and Classification System by Using Neck Pain Questionnaire (경항통 설문지를 이용한 한의학적 진단 및 분류체계에 관한 연구)

  • Song, In;Lee, Geon-Mok;Hong, Kwon-Eui
    • Journal of Acupuncture Research / v.28 no.3 / pp.85-100 / 2011
  • Objectives: The purpose of this thesis is to help prepare oriental medicine clinical guidelines by drawing up standards for oriental medicine demonstration and diagnosis classification of neck pain. Methods: Using an oriental medicine diagnosis questionnaire, statistical analysis was conducted on experts' opinions, collected by the Delphi method, that classified neck pain patients into Gyeonghangtong (頸項痛), Nakchim (落枕), Sagyeong (斜頸), and Hanggang (項强). The results were classified using linear discriminant analysis (LDA), diagonal linear discriminant analysis (DLDA), diagonal quadratic discriminant analysis (DQDA), K-nearest neighbor classification (KNN), classification and regression trees (CART), and support vector machines (SVM). Results: The results are summarized as follows. 1. The analysis using LDA has a hit rate of 84.47% in comparison with the original diagnosis. 2. A high hit rate was obtained when the test was conducted with three categories: the Gyeonghangtong and Hanggang category, the Sagyeong category, and the Nakchim category. 3. The analysis using DLDA has a hit rate of 58.25% in comparison with the original diagnosis. The analysis using DQDA has an accuracy of 57.28% in comparison with the original diagnosis. 4. The analysis using KNN has a hit rate of 69.90% in comparison with the original diagnosis. 5. The analysis using CART has a hit rate of 69.60% in comparison with the original diagnosis. The hit rate was 70.87% when the test was performed with 8 significant questions selected by analysis of variance. 6. The analysis using SVM has a hit rate of 80.58% in comparison with the original diagnosis. Conclusions: Statistical analysis using the oriental medicine diagnosis questionnaire on neck pain generally produced significant results.

A Two-Stage Learning Method of CNN and K-means RGB Cluster for Sentiment Classification of Images (이미지 감성분류를 위한 CNN과 K-means RGB Cluster 이-단계 학습 방안)

  • Kim, Jeongtae;Park, Eunbi;Han, Kiwoong;Lee, Junghyun;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.27 no.3 / pp.139-156 / 2021
  • The main reason for using a deep learning model in image classification is that it can consider the relationships between regions by extracting each region's features from the overall information of the image. However, a CNN model may not be suitable for emotional image data that lack regional features. To address the difficulty of classifying emotion images, researchers propose CNN-based architectures suited to emotion images every year. Studies on the relationship between color and human emotion have also found that different colors induce different emotions. Among studies using deep learning, some have applied color information to image sentiment classification; using the image's color information in addition to the image itself improves the accuracy of classifying image emotions compared with training the classification model on the image alone. This study proposes two ways to increase accuracy by adjusting the result value after the model classifies an image's emotion. Both methods modify the result value based on statistics of the colors in the picture. The two-color combinations most distributed across all training data are found in advance; at test time, the two-color combination most distributed in each test image is found, and the result values are corrected according to the color combination distribution. The methods weight the result value obtained after the model classifies an image's emotion using expressions based on the log function and the exponential function. Emotion6, classified into six emotions, and Artphoto, classified into eight categories, were used as image data. Densenet169, Mnasnet, Resnet101, Resnet152, and Vgg19 architectures were used for the CNN model, and performance was compared before and after applying the two-stage learning to the CNN model.
Inspired by color psychology, which deals with the relationship between colors and emotions, we studied how to improve the accuracy of a model that classifies an image's sentiment by modifying the result values based on color. Sixteen colors were used: red, orange, yellow, green, blue, indigo, purple, turquoise, pink, magenta, brown, gray, silver, gold, white, and black, each of which has a meaning. Using Scikit-learn's clustering, the seven colors that are primarily distributed in the image are identified. The RGB coordinate values of these colors are then compared with the RGB coordinate values of the 16 colors listed above; that is, each is converted to the closest color. If combinations of three or more colors are selected, too many combinations occur and the distribution becomes scattered, so each combination has less influence on the result value. To solve this problem, two-color combinations were found and used to weight the model output. Before training, the most distributed color combinations were found for all training data images. The distribution of color combinations for each class was stored in a Python dictionary for use during testing. During testing, the two-color combination most distributed in each test image is found; we then checked how that combination was distributed in the training data and corrected the result accordingly. We devised several equations to weight the result value from the model based on the extracted color. The dataset was randomly divided 80:20, with 20% held out as a test set. The remaining 80% was split into five folds for 5-fold cross-validation, and the model was trained five times using different validation sets. Finally, performance was checked on the held-out test set.
Adam was used as the optimizer, and the learning rate was set to 0.01. Training ran for up to 20 epochs and was stopped if the validation loss did not decrease for five epochs. Early stopping was set to load the model with the best validation loss. Classification accuracy was better when the information extracted from color properties was used together with the CNN architecture than when the CNN architecture was used alone.
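
The color-matching step described in this abstract (map each dominant color to the closest of 16 named colors, keep the two most distributed, and store per-class counts in a dictionary) might look roughly like the sketch below. Several things here are assumptions rather than details from the paper: the reference RGB values are common CSS-style values, closeness is measured by squared Euclidean distance in RGB space, the dominant colors are taken as already computed (for example by k-means clustering), and every function name is hypothetical. The log- and exponential-based weighting equations are not given in the abstract and are not reproduced.

```python
from collections import Counter

# Assumed reference RGB values for the 16 named colors (CSS-style values).
NAMED_COLORS = {
    "red": (255, 0, 0), "orange": (255, 165, 0), "yellow": (255, 255, 0),
    "green": (0, 128, 0), "blue": (0, 0, 255), "indigo": (75, 0, 130),
    "purple": (128, 0, 128), "turquoise": (64, 224, 208),
    "pink": (255, 192, 203), "magenta": (255, 0, 255),
    "brown": (165, 42, 42), "gray": (128, 128, 128),
    "silver": (192, 192, 192), "gold": (255, 215, 0),
    "white": (255, 255, 255), "black": (0, 0, 0),
}


def nearest_named_color(rgb):
    """Return the named color with the smallest squared RGB distance to `rgb`."""
    return min(
        NAMED_COLORS,
        key=lambda name: sum((a - b) ** 2 for a, b in zip(rgb, NAMED_COLORS[name])),
    )


def top_two_color_combination(dominant):
    """dominant: list of ((r, g, b), pixel_count), e.g. cluster centers and sizes.

    Returns the two most distributed named colors as a sorted tuple.
    """
    totals = Counter()
    for rgb, count in dominant:
        totals[nearest_named_color(rgb)] += count
    return tuple(sorted(name for name, _ in totals.most_common(2)))


def combination_distribution(labelled_combos):
    """Count color combinations per class: {class_label: Counter({combo: n})}."""
    dist = {}
    for label, combo in labelled_combos:
        dist.setdefault(label, Counter())[combo] += 1
    return dist


# Made-up cluster centers and pixel counts for a single image.
dominant = [((250, 10, 5), 500), ((10, 10, 240), 300), ((245, 245, 245), 100)]
print(top_two_color_combination(dominant))  # ('blue', 'red')
```

Sorting the pair makes ('blue', 'red') and ('red', 'blue') the same dictionary key, which is one simple way to keep the per-class distribution from being split across orderings.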