• Title/Summary/Keyword: CNN model

Search Result 1,011, Processing Time 0.03 seconds

A Two-Stage Learning Method of CNN and K-means RGB Cluster for Sentiment Classification of Images (이미지 감성분류를 위한 CNN과 K-means RGB Cluster 이-단계 학습 방안)

  • Kim, Jeongtae;Park, Eunbi;Han, Kiwoong;Lee, Junghyun;Lee, Hong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.3
    • /
    • pp.139-156
    • /
    • 2021
  • The biggest reason for using a deep learning model in image classification is that it is possible to consider the relationship between each region by extracting each region's features from the overall information of the image. However, the CNN model may not be suitable for emotional image data without the image's regional features. To solve the difficulty of classifying emotion images, many researchers each year propose a CNN-based architecture suitable for emotion images. Studies on the relationship between color and human emotion were also conducted, and results were derived that different emotions are induced according to color. In studies using deep learning, there have been studies that apply color information to image subtraction classification. The case where the image's color information is additionally used than the case where the classification model is trained with only the image improves the accuracy of classifying image emotions. This study proposes two ways to increase the accuracy by incorporating the result value after the model classifies an image's emotion. Both methods improve accuracy by modifying the result value based on statistics using the color of the picture. When performing the test by finding the two-color combinations most distributed for all training data, the two-color combinations most distributed for each test data image were found. The result values were corrected according to the color combination distribution. This method weights the result value obtained after the model classifies an image's emotion by creating an expression based on the log function and the exponential function. Emotion6, classified into six emotions, and Artphoto classified into eight categories were used for the image data. Densenet169, Mnasnet, Resnet101, Resnet152, and Vgg19 architectures were used for the CNN model, and the performance evaluation was compared before and after applying the two-stage learning to the CNN model. Inspired by color psychology, which deals with the relationship between colors and emotions, when creating a model that classifies an image's sentiment, we studied how to improve accuracy by modifying the result values based on color. Sixteen colors were used: red, orange, yellow, green, blue, indigo, purple, turquoise, pink, magenta, brown, gray, silver, gold, white, and black. It has meaning. Using Scikit-learn's Clustering, the seven colors that are primarily distributed in the image are checked. Then, the RGB coordinate values of the colors from the image are compared with the RGB coordinate values of the 16 colors presented in the above data. That is, it was converted to the closest color. Suppose three or more color combinations are selected. In that case, too many color combinations occur, resulting in a problem in which the distribution is scattered, so a situation fewer influences the result value. Therefore, to solve this problem, two-color combinations were found and weighted to the model. Before training, the most distributed color combinations were found for all training data images. The distribution of color combinations for each class was stored in a Python dictionary format to be used during testing. During the test, the two-color combinations that are most distributed for each test data image are found. After that, we checked how the color combinations were distributed in the training data and corrected the result. We devised several equations to weight the result value from the model based on the extracted color as described above. The data set was randomly divided by 80:20, and the model was verified using 20% of the data as a test set. After splitting the remaining 80% of the data into five divisions to perform 5-fold cross-validation, the model was trained five times using different verification datasets. Finally, the performance was checked using the test dataset that was previously separated. Adam was used as the activation function, and the learning rate was set to 0.01. The training was performed as much as 20 epochs, and if the validation loss value did not decrease during five epochs of learning, the experiment was stopped. Early tapping was set to load the model with the best validation loss value. The classification accuracy was better when the extracted information using color properties was used together than the case using only the CNN architecture.

Concurrent Detection for Vehicles and Lanes Using Light-Weight Model of Multi-Task CNN (멀티 테스크 CNN의 경량화 모델을 이용한 차량 및 차선의 동시 검출)

  • Shin, Hyeon-Sik;Kim, Hyung-Won;Hong, Sang-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.26 no.3
    • /
    • pp.367-373
    • /
    • 2022
  • As deep learning-based autonomous driving technology develops, artificial intelligence models for various purposes have been studied. Based on these studies, several models were used simultaneously to develop autonomous driving systems. It can occur by increasing hardware resource consumption. We propose a multi-tasks model using a shared backbone to solve this problem. This can solve the increase in the number of backbones for using AI models. As a result, in the proposed lightweight model, the model parameters could be reduced by more than 50% compared to the existing model, and the speed could be improved. In addition, each lane can be classified through lane detection using the instance segmentation method. However, further research is needed on the decrease in accuracy compared to the existing model.

Assessing Techniques for Advancing Land Cover Classification Accuracy through CNN and Transformer Model Integration (CNN 모델과 Transformer 조합을 통한 토지피복 분류 정확도 개선방안 검토)

  • Woo-Dam SIM;Jung-Soo LEE
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.27 no.1
    • /
    • pp.115-127
    • /
    • 2024
  • This research aimed to construct models with various structures based on the Transformer module and to perform land cover classification, thereby examining the applicability of the Transformer module. For the classification of land cover, the Unet model, which has a CNN structure, was selected as the base model, and a total of four deep learning models were constructed by combining both the encoder and decoder parts with the Transformer module. During the training process of the deep learning models, the training was repeated 10 times under the same conditions to evaluate the generalization performance. The evaluation of the classification accuracy of the deep learning models showed that the Model D, which utilized the Transformer module in both the encoder and decoder structures, achieved the highest overall accuracy with an average of approximately 89.4% and a Kappa coefficient average of about 73.2%. In terms of training time, models based on CNN were the most efficient. however, the use of Transformer-based models resulted in an average improvement of 0.5% in classification accuracy based on the Kappa coefficient. It is considered necessary to refine the model by considering various variables such as adjusting hyperparameters and image patch sizes during the integration process with CNN models. A common issue identified in all models during the land cover classification process was the difficulty in detecting small-scale objects. To improve this misclassification phenomenon, it is deemed necessary to explore the use of high-resolution input data and integrate multidimensional data that includes terrain and texture information.

A Study on the Deep Learning-based Tree Species Classification by using High-resolution Orthophoto Images (고해상도 정사영상을 이용한 딥러닝 기반의 산림수종 분류에 관한 연구)

  • JANG, Kwangmin
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.24 no.3
    • /
    • pp.1-9
    • /
    • 2021
  • In this study, we evaluated the accuracy of deep learning-based tree species classification model trained by using high-resolution images. We selected five species classed, i.e., pine, birch, larch, korean pine, mongolian oak for classification. We created 5,000 datasets using high-resolution orthophoto and forest type map. CNN deep learning model is used to tree species classification. We divided training data, verification data, and test data by a 5:3:2 ratio of the datasets and used it for the learning and evaluation of the model. The overall accuracy of the model was 89%. The accuracy of each species were pine 95%, birch 89%, larch 80%, korean pine 86% and mongolian oak 98%.

Speech Emotion Recognition Using 2D-CNN with Mel-Frequency Cepstrum Coefficients

  • Eom, Youngsik;Bang, Junseong
    • Journal of information and communication convergence engineering
    • /
    • v.19 no.3
    • /
    • pp.148-154
    • /
    • 2021
  • With the advent of context-aware computing, many attempts were made to understand emotions. Among these various attempts, Speech Emotion Recognition (SER) is a method of recognizing the speaker's emotions through speech information. The SER is successful in selecting distinctive 'features' and 'classifying' them in an appropriate way. In this paper, the performances of SER using neural network models (e.g., fully connected network (FCN), convolutional neural network (CNN)) with Mel-Frequency Cepstral Coefficients (MFCC) are examined in terms of the accuracy and distribution of emotion recognition. For Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset, by tuning model parameters, a two-dimensional Convolutional Neural Network (2D-CNN) model with MFCC showed the best performance with an average accuracy of 88.54% for 5 emotions, anger, happiness, calm, fear, and sadness, of men and women. In addition, by examining the distribution of emotion recognition accuracies for neural network models, the 2D-CNN with MFCC can expect an overall accuracy of 75% or more.

CNN-LSTM based Wind Power Prediction System to Improve Accuracy (정확도 향상을 위한 CNN-LSTM 기반 풍력발전 예측 시스템)

  • Park, Rae-Jin;Kang, Sungwoo;Lee, Jaehyeong;Jung, Seungmin
    • New & Renewable Energy
    • /
    • v.18 no.2
    • /
    • pp.18-25
    • /
    • 2022
  • In this study, we propose a wind power generation prediction system that applies machine learning and data mining to predict wind power generation. This system increases the utilization rate of new and renewable energy sources. For time-series data, the data set was established by measuring wind speed, wind generation, and environmental factors influencing the wind speed. The data set was pre-processed so that it could be applied appropriately to the model. The prediction system applied the CNN (Convolutional Neural Network) to the data mining process and then used the LSTM (Long Short-Term Memory) to learn and make predictions. The preciseness of the proposed system is verified by comparing the prediction data with the actual data, according to the presence or absence of data mining in the model of the prediction system.

Railway sleeper crack recognition based on edge detection and CNN

  • Wang, Gang;Xiang, Jiawei
    • Smart Structures and Systems
    • /
    • v.28 no.6
    • /
    • pp.779-789
    • /
    • 2021
  • Cracks in railway sleeper are an inevitable condition and has a significant influence on the safety of railway system. Although the technology of railway sleeper condition monitoring using machine learning (ML) models has been widely applied, the crack recognition accuracy is still in need of improvement. In this paper, a two-stage method using edge detection and convolutional neural network (CNN) is proposed to reduce the burden of computing for detecting cracks in railway sleepers with high accuracy. In the first stage, the edge detection is carried out by using the 3×3 neighborhood range algorithm to find out the possible crack areas, and a series of mathematical morphology operations are further used to eliminate the influence of noise targets to the edge detection results. In the second stage, a CNN model is employed to classify the results of edge detection. Through the analysis of abundant images of sleepers with cracks, it is proved that the cracks detected by the neighborhood range algorithm are superior to those detected by Sobel and Canny algorithms, which can be classified by proposed CNN model with high accuracy.

Sentiment Analysis to Evaluate Different Deep Learning Approaches

  • Sheikh Muhammad Saqib ;Tariq Naeem
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.11
    • /
    • pp.83-92
    • /
    • 2023
  • The majority of product users rely on the reviews that are posted on the appropriate website. Both users and the product's manufacturer could benefit from these reviews. Daily, thousands of reviews are submitted; how is it possible to read them all? Sentiment analysis has become a critical field of research as posting reviews become more and more common. Machine learning techniques that are supervised, unsupervised, and semi-supervised have worked very hard to harvest this data. The complicated and technological area of feature engineering falls within machine learning. Using deep learning, this tedious process may be completed automatically. Numerous studies have been conducted on deep learning models like LSTM, CNN, RNN, and GRU. Each model has employed a certain type of data, such as CNN for pictures and LSTM for language translation, etc. According to experimental results utilizing a publicly accessible dataset with reviews for all of the models, both positive and negative, and CNN, the best model for the dataset was identified in comparison to the other models, with an accuracy rate of 81%.

1D-CNN-LSTM Hybrid-Model-Based Pet Behavior Recognition through Wearable Sensor Data Augmentation

  • Hyungju Kim;Nammee Moon
    • Journal of Information Processing Systems
    • /
    • v.20 no.2
    • /
    • pp.159-172
    • /
    • 2024
  • The number of healthcare products available for pets has increased in recent times, which has prompted active research into wearable devices for pets. However, the data collected through such devices are limited by outliers and missing values owing to the anomalous and irregular characteristics of pets. Hence, we propose pet behavior recognition based on a hybrid one-dimensional convolutional neural network (CNN) and long short- term memory (LSTM) model using pet wearable devices. An Arduino-based pet wearable device was first fabricated to collect data for behavior recognition, where gyroscope and accelerometer values were collected using the device. Then, data augmentation was performed after replacing any missing values and outliers via preprocessing. At this time, the behaviors were classified into five types. To prevent bias from specific actions in the data augmentation, the number of datasets was compared and balanced, and CNN-LSTM-based deep learning was performed. The five subdivided behaviors and overall performance were then evaluated, and the overall accuracy of behavior recognition was found to be about 88.76%.

Effect of Input Data Video Interval and Input Data Image Similarity on Learning Accuracy in 3D-CNN

  • Kim, Heeil;Chung, Yeongjee
    • International Journal of Internet, Broadcasting and Communication
    • /
    • v.13 no.2
    • /
    • pp.208-217
    • /
    • 2021
  • 3D-CNN is one of the deep learning techniques for learning time series data. However, these three-dimensional learning can generate many parameters, requiring high performance or having a significant impact on learning speed. We will use these 3D-CNNs to learn hand gesture and find the parameters that showed the highest accuracy, and then analyze how the accuracy of 3D-CNN varies through input data changes without any structural changes in 3D-CNN. First, choose the interval of the input data. This adjusts the ratio of the stop interval to the gesture interval. Secondly, the corresponding interframe mean value is obtained by measuring and normalizing the similarity of images through interclass 2D cross correlation analysis. This experiment demonstrates that changes in input data affect learning accuracy without structural changes in 3D-CNN. In this paper, we proposed two methods for changing input data. Experimental results show that input data can affect the accuracy of the model.