• Title/Abstract/Keywords: CNN Model

Search results: 963 items (processing time: 0.034 s)

Speech Emotion Recognition Using 2D-CNN with Mel-Frequency Cepstrum Coefficients

  • Eom, Youngsik;Bang, Junseong
    • Journal of information and communication convergence engineering / Vol. 19, No. 3 / pp.148-154 / 2021
  • With the advent of context-aware computing, many attempts have been made to understand emotions. Among these attempts, Speech Emotion Recognition (SER) is a method of recognizing the speaker's emotions from speech information. The success of SER depends on selecting distinctive features and classifying them appropriately. In this paper, the performance of SER using neural network models (e.g., fully connected network (FCN), convolutional neural network (CNN)) with Mel-Frequency Cepstral Coefficients (MFCC) is examined in terms of the accuracy and distribution of emotion recognition. For the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) dataset, after tuning model parameters, a two-dimensional Convolutional Neural Network (2D-CNN) model with MFCC showed the best performance, with an average accuracy of 88.54% across five emotions (anger, happiness, calm, fear, and sadness) for male and female speakers. In addition, an examination of the distribution of emotion recognition accuracies across the neural network models suggests that the 2D-CNN with MFCC can be expected to achieve an overall accuracy of 75% or more.

정확도 향상을 위한 CNN-LSTM 기반 풍력발전 예측 시스템 (CNN-LSTM based Wind Power Prediction System to Improve Accuracy)

  • 박래진;강성우;이재형;정승민
    • 신재생에너지 / Vol. 18, No. 2 / pp.18-25 / 2022
  • In this study, we propose a wind power generation prediction system that applies machine learning and data mining. This system increases the utilization rate of new and renewable energy sources. For the time-series data, a data set was established by measuring wind speed, wind generation, and environmental factors influencing the wind speed. The data set was pre-processed so that it could be applied appropriately to the model. The prediction system applied a CNN (Convolutional Neural Network) to the data mining process and then used an LSTM (Long Short-Term Memory) to learn and make predictions. The accuracy of the proposed system is verified by comparing the predicted data with the actual data, with and without data mining in the model of the prediction system.

Railway sleeper crack recognition based on edge detection and CNN

  • Wang, Gang;Xiang, Jiawei
    • Smart Structures and Systems / Vol. 28, No. 6 / pp.779-789 / 2021
  • Cracks in railway sleepers are inevitable and have a significant influence on the safety of the railway system. Although railway sleeper condition monitoring using machine learning (ML) models has been widely applied, crack recognition accuracy still needs improvement. In this paper, a two-stage method using edge detection and a convolutional neural network (CNN) is proposed to reduce the computational burden of detecting cracks in railway sleepers with high accuracy. In the first stage, edge detection is carried out using the 3×3 neighborhood range algorithm to find possible crack areas, and a series of mathematical morphology operations is then used to eliminate the influence of noise targets on the edge detection results. In the second stage, a CNN model is employed to classify the results of edge detection. Analysis of a large number of images of sleepers with cracks shows that the cracks detected by the neighborhood range algorithm are superior to those detected by the Sobel and Canny algorithms, and that they can be classified by the proposed CNN model with high accuracy.
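
The first stage described above can be illustrated with a minimal sketch. The abstract does not define the "3×3 neighborhood range algorithm", so this assumes it means the local range (maximum minus minimum intensity) over each pixel's 3×3 window; the function names and the threshold are hypothetical, and the morphology and CNN stages are omitted.

```python
def neighborhood_range(image):
    """Return max - min over each pixel's 3x3 neighborhood.

    `image` is a list of rows of grayscale values. Windows are clipped
    at the image border (an assumption; the abstract does not say how
    borders are handled).
    """
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out[r][c] = max(window) - min(window)
    return out


def edge_mask(image, threshold):
    """Mark pixels whose local range reaches a hypothetical threshold."""
    return [
        [1 if value >= threshold else 0 for value in row]
        for row in neighborhood_range(image)
    ]
```

A flat region yields a range of zero everywhere, while any window touching a sharp intensity change yields a large range, which is why such a filter can propose candidate crack areas before classification.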

Sentiment Analysis to Evaluate Different Deep Learning Approaches

  • Sheikh Muhammad Saqib;Tariq Naeem
    • International Journal of Computer Science & Network Security / Vol. 23, No. 11 / pp.83-92 / 2023
  • The majority of product users rely on the reviews posted on the relevant website. Both users and the product's manufacturer can benefit from these reviews. Thousands of reviews are submitted daily; how is it possible to read them all? Sentiment analysis has become a critical field of research as posting reviews becomes more and more common. Supervised, unsupervised, and semi-supervised machine learning techniques have been applied extensively to harvest this data. Machine learning, however, involves the complicated and technical task of feature engineering. Using deep learning, this tedious process can be completed automatically. Numerous studies have been conducted on deep learning models such as LSTM, CNN, RNN, and GRU. Each model has typically been applied to a certain type of data, such as CNN for images and LSTM for language translation. In experiments that applied all of the models to a publicly accessible dataset of positive and negative reviews, CNN was identified as the best model for the dataset, with an accuracy of 81%.

1D-CNN-LSTM Hybrid-Model-Based Pet Behavior Recognition through Wearable Sensor Data Augmentation

  • Hyungju Kim;Nammee Moon
    • Journal of Information Processing Systems / Vol. 20, No. 2 / pp.159-172 / 2024
  • The number of healthcare products available for pets has increased in recent times, which has prompted active research into wearable devices for pets. However, the data collected through such devices are limited by outliers and missing values owing to the anomalous and irregular characteristics of pets. Hence, we propose pet behavior recognition based on a hybrid one-dimensional convolutional neural network (CNN) and long short-term memory (LSTM) model using pet wearable devices. An Arduino-based pet wearable device was first fabricated to collect gyroscope and accelerometer values for behavior recognition. Then, data augmentation was performed after replacing any missing values and outliers via preprocessing. At this stage, the behaviors were classified into five types. To prevent bias from specific actions in the data augmentation, the number of datasets was compared and balanced, and CNN-LSTM-based deep learning was performed. The five subdivided behaviors and overall performance were then evaluated, and the overall accuracy of behavior recognition was found to be about 88.76%.

얼굴 열화상 기반 감정인식을 위한 CNN 학습전략 (Divide and Conquer Strategy for CNN Model in Facial Emotion Recognition based on Thermal Images)

  • 이동환;유장희
    • 한국소프트웨어감정평가학회 논문지 / Vol. 17, No. 2 / pp.1-10 / 2021
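  • Emotion recognition is a widely studied technology owing to the diversity of its applications, and the need for emotion recognition using thermal images as well as RGB images is growing. Compared with RGB images, thermal images have the advantage of being almost unaffected by illumination, but their low resolution calls for high-performance recognition techniques. In this paper, we propose a Divide and Conquer-based CNN training strategy to improve the performance of facial emotion recognition based on thermal images. The proposed method first trains the model to classify similar emotion classes that are difficult to distinguish, identified through confusion matrix analysis, into the same class group, and then splits the problem by recognizing the emotions within each class group as the actual emotions. Experiments confirmed that the proposed training strategy achieved higher recognition performance in all experiments than recognizing all of the given emotions with a single CNN model.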
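
The two-step idea in this abstract (group confusable classes from a confusion matrix, then resolve the actual class within each group) can be sketched as follows. This is only an illustration under stated assumptions: the mutual-confusion rate, the threshold, and every function and parameter name here are hypothetical, and the models are stand-in callables rather than the CNNs used in the paper.

```python
def group_confusable_classes(confusion, threshold):
    """Merge classes whose mutual confusion rate reaches `threshold`.

    `confusion[i][j]` counts samples of true class i predicted as j.
    The rate used here, (c[i][j] + c[j][i]) / (samples of i and j),
    is an assumed criterion, not one given in the abstract.
    """
    n = len(confusion)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    row_totals = [sum(row) for row in confusion]
    for i in range(n):
        for j in range(i + 1, n):
            total = row_totals[i] + row_totals[j]
            mixed = confusion[i][j] + confusion[j][i]
            if total and mixed / total >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def predict_two_stage(sample, groups, group_model, fine_models):
    """Stage 1 picks a class group; stage 2 picks the class inside it."""
    group_index = group_model(sample)
    members = groups[group_index]
    if len(members) == 1:
        return members[0]  # nothing left to disambiguate
    return fine_models[group_index](sample)
```

With a three-class confusion matrix in which classes 0 and 1 are often mistaken for each other, `group_confusable_classes` returns `[[0, 1], [2]]`, so only the first group needs a second-stage model.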

Study on Real-time Detection Using Odor Data Based on Mixed Neural Network of CNN and LSTM

  • Gi-Seok Lee;Sang-Hyun Lee
    • International Journal of Advanced Culture Technology / Vol. 11, No. 1 / pp.325-331 / 2023
  • In this paper, we propose a mixed neural network structure of CNN and LSTM that can be used to detect or predict odor occurrence, a capability much needed in the manufacturing industry and in daily life, using complex odor sensors. The proposed learning model uses a complex odor sensor to receive four types of data (hydrogen sulfide, ammonia, benzene, and toluene) in real time and applies this data to an inference model to detect and predict odor conditions. The prediction accuracy of the learning model was evaluated using accuracy-based performance indicators, and the evaluation showed an average performance of 94% or more.

One Step Measurements of Hippocampal Pure Volumes from MRI Data Using an Ensemble Model of 3-D Convolutional Neural Network

  • Basher, Abol;Ahmed, Samsuddin;Jung, Ho Yub
    • 스마트미디어저널 / Vol. 9, No. 2 / pp.22-32 / 2020
  • Hippocampal volume atrophy is known to be linked with neurodegenerative disorders and is one of the most important early biomarkers for Alzheimer's disease detection. Measuring hippocampal pure volumes from Magnetic Resonance Imaging (MRI) is a crucial task, and state-of-the-art methods require a large amount of time. In addition, structural brain development is investigated using MRI data, where the study of brain morphometry (e.g., cortical thickness, volume, surface area) is a significant part of the analysis. In this study, we propose a patch-based ensemble model of 3-D convolutional neural networks (CNNs) to measure the hippocampal pure volume from MRI data. The 3-D patches were extracted from the volumetric MRI scans to train the proposed 3-D CNN models. The trained models are used to construct the ensemble 3-D CNN model, and the aggregated model predicts the pure volume in one step in the test phase. Our approach takes only 5 seconds to estimate the volumes from an MRI scan. The average errors for the proposed ensemble 3-D CNN model are 11.7±8.8 (error%±STD) and 12.5±12.8 (error%±STD) for the left and right hippocampi of 65 test MRI scans, respectively. The quantitative study of the predicted volumes against the ground truth volumes shows that the proposed approach can be used as a proxy.
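
For readers unfamiliar with the "error%±STD" notation, the sketch below shows one plausible way such a summary could be computed. The abstract does not give the formula, so this assumes the absolute percentage error per scan, averaged over scans, with the population standard deviation; the function names are hypothetical.

```python
from statistics import mean, pstdev


def percent_errors(predicted, truth):
    """Absolute percentage error of each predicted volume (assumed metric)."""
    return [100 * abs(p - t) / t for p, t in zip(predicted, truth)]


def summarize_error(predicted, truth):
    """Return (mean error%, standard deviation) across scans.

    pstdev (population standard deviation) is an assumption; a sample
    standard deviation (statistics.stdev) would be equally plausible.
    """
    errors = percent_errors(predicted, truth)
    return mean(errors), pstdev(errors)
```

Calling `summarize_error(predicted_volumes, ground_truth_volumes)` on two equal-length lists yields the pair that would be reported in the "mean±STD" form.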

영상기반 콘크리트 균열 탐지 딥러닝 모델의 유형별 성능 비교 (A Comparative Study on Performance of Deep Learning Models for Vision-based Concrete Crack Detection according to Model Types)

  • 김병현;김건순;진수민;조수진
    • 한국안전학회지 / Vol. 34, No. 6 / pp.50-57 / 2019
  • In this study, various types of recently proposed deep learning models are classified according to their data input/output types and analyzed to find the deep learning model best suited to constructing a crack detection model. First, the deep learning models are classified into image classification, object segmentation, object detection, and instance segmentation models. ResNet-101, DeepLab V2, Faster R-CNN, and Mask R-CNN were selected as the representative deep learning model of each type. For the comparison, ResNet-101 was implemented in all four types of deep learning model as the backbone network, which serves as the main feature extractor. The four types of deep learning models were trained with 500 crack images taken from real concrete structures and collected from the Internet. All four showed accuracy above 94% during training. Comparative evaluation was conducted using 40 images taken from real concrete structures. The performance of each type of deep learning model was measured using precision and recall. In the experimental results, Mask R-CNN, an instance segmentation deep learning model, showed the highest precision and recall on crack detection. Qualitative analysis also shows that, of the four, Mask R-CNN detected crack shapes most similar to the real crack shapes.
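
Precision and recall, the two measures named in this abstract, follow from counts of true positives, false positives, and false negatives. The sketch below uses the standard definitions; how the study counted a "detected crack" (per image, per region, or per pixel) is not stated, so the inputs here are simply assumed to be such counts, and the function name is hypothetical.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Return (precision, recall) from detection counts.

    precision = TP / (TP + FP): share of reported cracks that are real.
    recall    = TP / (TP + FN): share of real cracks that were reported.
    A zero denominator is mapped to 0.0 by convention in this sketch.
    """
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall
```

Reporting both values matters for crack detection because a model can reach high precision by flagging only the most obvious cracks while missing many others, which shows up as low recall.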

X-ray 이물검출기의 이물 검출 향상을 위한 딥러닝 방법 (Deep Learning Method for Improving Contamination Detection of X-ray Inspection System)

  • 임병휘;정승수;유윤섭
    • 한국정보통신학회:학술대회논문집 / 한국정보통신학회 2021 Spring Conference / pp.460-462 / 2021
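  • Food must fundamentally be both nutritious and safe. Recently, a number of people with food-poisoning symptoms were reported at a kindergarten in Ansan where the safety of the food was in doubt. Food safety is therefore in even greater demand. In this paper, we propose a method for improving the detection rate of a contamination detector by means of deep learning models in order to ensure food safety. The proposed method trains CNN (convolution neural network) and Faster R-CNN (region convolution neural network) networks and tests them on images of normal products and products containing contaminants. Test results with the deep learning models showed that combining Faster R-CNN with the existing detector's algorithm achieved a better detection rate than the other methods.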
