• Title/Summary/Keyword: Convolutional neural networks (CNN)

Development of Deep Learning Models for Multi-class Sentiment Analysis (딥러닝 기반의 다범주 감성분석 모델 개발)

  • Syaekhoni, M. Alex; Seo, Sang Hyun; Kwon, Young S.
    • Journal of Information Technology Services / v.16 no.4 / pp.149-160 / 2017
  • Sentiment analysis is the process of determining whether a document, text, or conversation expresses a positive, negative, neutral, or other emotion. It has been applied to several real-world applications such as chatbots, whose practical use has spread across many fields of industry over the last five years. In chatbot applications, sentiment analysis must be performed before the speaker's intent can be understood, and the emotion to be recognized goes beyond labeling sentences as merely positive or negative. In this context, we propose deep learning models for multi-class sentiment analysis that identify the speaker's emotion, categorized as joy, fear, guilt, sadness, shame, disgust, or anger. We develop convolutional neural network (CNN), long short-term memory (LSTM), and multi-layer neural network models for detecting the emotion in a sentence, and also apply a word embedding process. In our experiments, the LSTM model performs best, outperforming the CNN and multi-layer neural network models. We also demonstrate the practical applicability of these deep learning models to sentiment analysis for chatbots.
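
A minimal, hypothetical sketch of such a model (not the authors' implementation): a word-embedding layer feeds an LSTM whose final hidden state is mapped to the seven emotion labels. Vocabulary size and layer widths are assumptions.

```python
import torch
import torch.nn as nn

class EmotionLSTM(nn.Module):
    def __init__(self, vocab_size=20000, embed_dim=128, hidden_dim=64, num_classes=7):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_ids):
        x = self.embedding(token_ids)          # (batch, seq_len, embed_dim)
        _, (h_n, _) = self.lstm(x)             # h_n: (1, batch, hidden_dim)
        return self.fc(h_n.squeeze(0))         # (batch, num_classes) logits

model = EmotionLSTM()
dummy = torch.randint(1, 20000, (4, 30))       # batch of 4 sentences, 30 tokens each
print(model(dummy).shape)                      # torch.Size([4, 7])
```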

Dynamic Adjustment of the Pruning Threshold in Deep Compression (Deep Compression의 프루닝 문턱값 동적 조정)

  • Lee, Yeojin; Park, Hanhoon
    • Journal of the Institute of Convergence Signal Processing / v.22 no.3 / pp.99-103 / 2021
  • Recently, convolutional neural networks (CNNs) have been widely utilized owing to their outstanding performance in various computer vision fields. However, because of their intensive computation and high memory requirements, it is difficult to deploy CNNs on hardware platforms with limited resources, such as mobile and IoT devices. To address these limitations, research on neural network compression is underway to reduce the size of neural networks while maintaining their performance. This paper proposes a CNN compression technique that dynamically adjusts the thresholds of pruning, one of the neural network compression techniques. Unlike conventional pruning, which sets the thresholds that determine the weights to be pruned experimentally or heuristically, the proposed technique dynamically finds the optimal thresholds that prevent accuracy degradation and outputs the lightweight neural network in less time. To validate the performance of the proposed technique, LeNet was trained on the MNIST dataset, and the lightweight LeNet could be obtained automatically, 1.3 to 3 times faster, without loss of accuracy.
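
The sketch below is a hypothetical illustration of threshold-based magnitude pruning, not the paper's algorithm: the threshold is grown geometrically and rolled back as soon as a user-supplied `evaluate` callback reports an accuracy drop beyond the tolerance. The names `evaluate`, `base_acc`, and the schedule constants are assumptions.

```python
import torch
import torch.nn as nn

def prune_below(model, threshold):
    """Zero out every weight whose magnitude is below the threshold."""
    with torch.no_grad():
        for p in model.parameters():
            p.mul_((p.abs() >= threshold).float())

def find_threshold(model, evaluate, base_acc, tol=0.001,
                   start=1e-4, factor=1.5, steps=20):
    """Grow the pruning threshold until accuracy drops by more than tol."""
    original = {k: v.clone() for k, v in model.state_dict().items()}
    threshold, best = start, 0.0
    for _ in range(steps):
        model.load_state_dict(original)       # always prune from the dense weights
        prune_below(model, threshold)
        if evaluate(model) >= base_acc - tol:
            best = threshold                  # still within tolerance, keep growing
            threshold *= factor
        else:
            break                             # accuracy degraded, stop searching
    model.load_state_dict(original)
    prune_below(model, best)                  # apply the largest safe threshold
    return best
```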

Enhancing Multimodal Emotion Recognition in Speech and Text with Integrated CNN, LSTM, and BERT Models (통합 CNN, LSTM, 및 BERT 모델 기반의 음성 및 텍스트 다중 모달 감정 인식 연구)

  • Edward Dwijayanto Cahyadi; Hans Nathaniel Hadi Soesilo; Mi-Hwa Song
    • The Journal of the Convergence on Culture Technology / v.10 no.1 / pp.617-623 / 2024
  • Identifying emotions through speech poses a significant challenge due to the complex relationship between language and emotion. Our paper takes on this challenge by employing feature engineering to identify emotions in a multimodal classification task involving both speech and text data. We evaluated two classifiers, Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM), each integrated with a BERT-based pre-trained model. Our assessment covers various performance metrics (accuracy, F-score, precision, and recall) across different experimental setups. The findings highlight the proficiency of both models in accurately discerning emotions from text and speech data.
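
A hypothetical fusion sketch, not the authors' architecture: a small CNN summarizes the Mel-spectrogram of the speech signal, the result is concatenated with a pooled BERT sentence vector (assumed precomputed and 768-dimensional), and a shared head predicts the emotion class. The class count and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class AudioTextFusion(nn.Module):
    def __init__(self, bert_dim=768, num_classes=7):
        super().__init__()
        self.audio_cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),                                  # -> (batch, 32)
        )
        self.classifier = nn.Sequential(
            nn.Linear(32 + bert_dim, 128), nn.ReLU(), nn.Linear(128, num_classes)
        )

    def forward(self, mel, bert_vec):
        return self.classifier(torch.cat([self.audio_cnn(mel), bert_vec], dim=1))

model = AudioTextFusion()
mel = torch.randn(2, 1, 64, 100)     # (batch, channel, Mel bins, frames)
text = torch.randn(2, 768)           # pooled BERT embeddings (placeholder)
print(model(mel, text).shape)        # torch.Size([2, 7])
```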

Automatic Wood Species Identification of Korean Softwood Based on Convolutional Neural Networks

  • Kwon, Ohkyung; Lee, Hyung Gu; Lee, Mi-Rim; Jang, Sujin; Yang, Sang-Yun; Park, Se-Yeong; Choi, In-Gyu; Yeo, Hwanmyeong
    • Journal of the Korean Wood Science and Technology / v.45 no.6 / pp.797-808 / 2017
  • Automatic wood species identification systems have enabled fast and accurate identification of wood species outside of specialized laboratories staffed by well-trained experts. Conventional systems consist of two major parts, a feature extractor and a classifier, and the feature extractor requires hand-engineering to obtain optimal features that quantify the content of an image. A Convolutional Neural Network (CNN), one of the deep learning methods, trained for wood species can extract intrinsic feature representations and classify them correctly, and it usually outperforms classifiers built on top of hand-tuned features. We developed an automatic wood species identification system utilizing CNN models such as LeNet, MiniVGGNet, and their variants. A smartphone camera was used to obtain macroscopic images of rough-sawn surfaces from cross sections of wood, and five Korean softwood species (cedar, cypress, Korean pine, Korean red pine, and larch) were classified by the CNN models. The most accurate and most stable model was LeNet3, which adds two layers to the original LeNet architecture; its identification accuracy for the five Korean softwood species was 99.3%. The results show that the automatic wood species identification system is fast, accurate, and small enough to be deployed on a mobile device such as a smartphone.
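
A hypothetical LeNet-style sketch for a five-class image classifier of this kind; the layer sizes and 64x64 input resolution are assumptions and do not reproduce the paper's LeNet3 configuration.

```python
import torch
import torch.nn as nn

class SmallLeNet(nn.Module):
    def __init__(self, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, 5), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(6, 16, 5), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 13 * 13, 120), nn.ReLU(),
            nn.Linear(120, num_classes),
        )

    def forward(self, x):                      # x: (batch, 3, 64, 64) RGB patches
        return self.classifier(self.features(x))

print(SmallLeNet()(torch.randn(1, 3, 64, 64)).shape)   # torch.Size([1, 5])
```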

Shooting sound analysis using convolutional neural networks and long short-term memory (합성곱 신경망과 장단기 메모리를 이용한 사격음 분석 기법)

  • Kang, Se Hyeok; Cho, Ji Woong
    • The Journal of the Acoustical Society of Korea / v.41 no.3 / pp.312-318 / 2022
  • This paper proposes a model that classifies the type of gun and information about the sound source location using a deep neural network. The proposed classification model is composed of convolutional neural networks (CNN) and long short-term memory (LSTM). For training and testing the model, we use the Gunshot Audio Forensic Dataset generated by a project supported by the National Institute of Justice (NIJ). The acoustic signals are transformed into Mel-spectrograms and provided as training and test data for the proposed model. The model is compared with a control model consisting of convolutional neural networks only, and the proposed model achieves an accuracy of more than 90 %.
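
A hypothetical CNN + LSTM sketch, not the paper's exact model: a small CNN reduces each Mel-spectrogram to a frame-wise feature sequence, which an LSTM then classifies. The class count and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class CNNLSTMClassifier(nn.Module):
    def __init__(self, n_mels=64, hidden=64, num_classes=8):   # class count assumed
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
        )
        self.lstm = nn.LSTM(32 * (n_mels // 4), hidden, batch_first=True)
        self.fc = nn.Linear(hidden, num_classes)

    def forward(self, mel):                    # mel: (batch, 1, n_mels, frames)
        f = self.cnn(mel)                      # (batch, 32, n_mels/4, frames)
        f = f.permute(0, 3, 1, 2).flatten(2)   # (batch, frames, 32 * n_mels/4)
        _, (h, _) = self.lstm(f)
        return self.fc(h.squeeze(0))           # (batch, num_classes) logits

print(CNNLSTMClassifier()(torch.randn(2, 1, 64, 120)).shape)   # torch.Size([2, 8])
```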

Deep Learning Architectures and Applications (딥러닝의 모형과 응용사례)

  • Ahn, SungMahn
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.127-142 / 2016
  • A deep learning model is a kind of neural network that allows multiple hidden layers. There are various deep learning architectures, such as convolutional neural networks, deep belief networks, and recurrent neural networks. They have been applied to fields such as computer vision, automatic speech recognition, natural language processing, audio recognition, and bioinformatics, where they have been shown to produce state-of-the-art results on various tasks. Among these architectures, convolutional neural networks and recurrent neural networks are classified as supervised learning models, and in recent years these supervised models have gained more popularity than unsupervised models such as deep belief networks, because they have shown successful applications in the fields mentioned above. Deep learning models can be trained with the backpropagation algorithm. Backpropagation is an abbreviation for "backward propagation of errors" and is a common method of training artificial neural networks used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of an error function with respect to all the weights in the network; the gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the error function. Convolutional neural networks use a special architecture that is particularly well adapted to classifying images. This architecture makes convolutional networks fast to train, which in turn helps us train deep, multi-layer networks that are very good at classifying images; these days, deep convolutional networks are used in most neural networks for image recognition. Convolutional neural networks rest on three basic ideas: local receptive fields, shared weights, and pooling. By local receptive fields, we mean that each neuron in the first (or any) hidden layer is connected to a small region of the input (or previous layer's) neurons. Shared weights mean that the same weights and bias are used for each of the local receptive fields, so all the neurons in a hidden layer detect exactly the same feature, just at different locations in the input image. In addition to the convolutional layers just described, convolutional neural networks also contain pooling layers, which are usually placed immediately after convolutional layers and simplify the information in the convolutional layer's output. Recent convolutional network architectures have 10 to 20 hidden layers and billions of connections between units. Training deep networks took weeks several years ago, but thanks to progress in GPUs and algorithmic improvements, training time has been reduced to several hours. Neural networks with time-varying behavior are known as recurrent neural networks, or RNNs. A recurrent neural network is a class of artificial neural network in which connections between units form a directed cycle. This creates an internal state of the network, which allows it to exhibit dynamic temporal behavior; unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Early RNN models turned out to be very difficult to train, harder even than deep feedforward networks, because of unstable gradient problems such as vanishing and exploding gradients. The gradient can get smaller and smaller as it is propagated back through layers, which makes learning in early layers extremely slow. The problem gets worse in RNNs, since gradients are propagated backward not just through layers but also through time; if the network runs for a long time, the gradient can become extremely unstable and hard to learn from. Incorporating long short-term memory units (LSTMs) into RNNs makes it much easier to get good results, and many recent papers make use of LSTMs or related ideas.
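
A minimal sketch of the three convolutional ideas described above (local receptive fields, shared weights, pooling): one convolutional layer followed by max pooling on a single grayscale image. The layer sizes are illustrative only.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=1, out_channels=8, kernel_size=5)  # 5x5 local receptive fields;
                                                                 # the same weights are shared
pool = nn.MaxPool2d(kernel_size=2)                               # across the whole image

image = torch.randn(1, 1, 28, 28)     # one 28x28 grayscale image
feature_maps = conv(image)            # (1, 8, 24, 24): 8 feature maps
pooled = pool(feature_maps)           # (1, 8, 12, 12): pooled, simplified summary
print(feature_maps.shape, pooled.shape)
```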

A Study on Lexicon Integrated Convolutional Neural Networks for Sentiment Analysis (감성 분석을 위한 어휘 통합 합성곱 신경망에 관한 연구)

  • Yoon, Joo-Sung; Kim, Hyeon-Cheol
    • Annual Conference of KIPS / 2017.04a / pp.916-919 / 2017
  • With recent advances in deep learning, various techniques have been applied to sentiment analysis. Convolutional Neural Networks (CNN), which have shown high performance in image and speech recognition, are now actively studied in natural language processing as well and are known to be effective for sentiment analysis. In conventional machine learning, lexicon-based techniques were studied extensively, but such attempts gradually declined with the advent of word embeddings. Lexica, however, still provide useful information for sentiment analysis. In this study, using the Twitter dataset provided by SemEval 2017 Task 4 and various lexicon corpora, we investigate how much a model's performance improves when a lexicon is combined with a CNN, and we analyze the respective effects of word embeddings and the lexicon. The evaluation metric is the macro-averaged F1 score over the three classes positive, negative, and neutral.
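
A hypothetical sketch of lexicon integration, not the authors' model: sentence-level lexicon scores are concatenated with CNN features extracted from word embeddings before the three-class (positive/negative/neutral) output layer. The lexicon feature dimension and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class LexiconCNN(nn.Module):
    def __init__(self, vocab_size=20000, embed_dim=100, lexicon_dim=4, num_classes=3):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.conv = nn.Conv1d(embed_dim, 64, kernel_size=3, padding=1)
        self.fc = nn.Linear(64 + lexicon_dim, num_classes)

    def forward(self, token_ids, lexicon_scores):
        x = self.embedding(token_ids).transpose(1, 2)    # (batch, embed_dim, seq_len)
        x = torch.relu(self.conv(x)).max(dim=2).values   # max-over-time pooling
        return self.fc(torch.cat([x, lexicon_scores], dim=1))

model = LexiconCNN()
tokens = torch.randint(1, 20000, (4, 25))
lex = torch.randn(4, 4)               # e.g. averaged polarity scores from the lexica
print(model(tokens, lex).shape)       # torch.Size([4, 3])
```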

Automatic Volumetric Brain Tumor Segmentation using Convolutional Neural Networks

  • Yavorskyi, Vladyslav; Sull, Sanghoon
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2019.05a / pp.432-435 / 2019
  • Convolutional Neural Networks (CNNs) have recently been gaining popularity in the medical image analysis field because of their image segmentation capabilities. In this paper, we present a CNN that performs automated brain tumor segmentation of sparsely annotated 3D Magnetic Resonance Imaging (MRI) scans. Our CNN is based on the 3D U-Net architecture and includes separate dilated and depth-wise convolutions. It is fully trained on the BraTS 2018 dataset and produces more accurate results than even the winners of the BraTS 2017 competition, despite having a significantly smaller number of parameters.
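
A hypothetical sketch of a single building block only, not the authors' network: a 3D depth-wise separable convolution with dilation, of the kind a compact 3D U-Net encoder might stack. The channel counts and dilation rate are assumptions.

```python
import torch
import torch.nn as nn

class DilatedDepthwiseBlock3d(nn.Module):
    def __init__(self, in_ch, out_ch, dilation=2):
        super().__init__()
        # depth-wise 3D convolution: one filter per input channel, dilated
        self.depthwise = nn.Conv3d(in_ch, in_ch, kernel_size=3, padding=dilation,
                                   dilation=dilation, groups=in_ch)
        # point-wise 1x1x1 convolution mixes channels cheaply
        self.pointwise = nn.Conv3d(in_ch, out_ch, kernel_size=1)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.pointwise(self.depthwise(x)))

block = DilatedDepthwiseBlock3d(4, 16)              # e.g. 4 MRI modalities in
print(block(torch.randn(1, 4, 32, 32, 32)).shape)   # torch.Size([1, 16, 32, 32, 32])
```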

Hand Gesture Recognition with Convolution Neural Networks for Augmented Reality Cognitive Rehabilitation System Based on Leap Motion Controller (립모션 센서 기반 증강현실 인지재활 훈련시스템을 위한 합성곱신경망 손동작 인식)

  • Song, Keun San; Lee, Hyun Ju; Tae, Ki Sik
    • Journal of Biomedical Engineering Research / v.42 no.4 / pp.186-192 / 2021
  • In this paper, we evaluated the prediction accuracy of an Euler angle spectrogram classification method using a convolutional neural network (CNN) for hand gesture recognition in an augmented reality (AR) cognitive rehabilitation system based on the Leap Motion Controller (LMC). Hand gesture recognition using a conventional support vector machine (SVM) shows 91.3% accuracy across multiple motions. In this paper, five hand gestures ("Promise", "Bunny", "Close", "Victory", and "Thumb") were selected and each measured 100 times to test the utility of the spectral classification technique. In validation, the five hand gestures were correctly predicted 100% of the time, indicating recognition accuracy superior to that of the conventional SVM method. This suggests that CNN-based hand gesture recognition is more useful for LMC-based AR cognitive rehabilitation training systems than SVM-based sign language recognition.

Efficient Collecting Scheme the Crack Data via Vector based Data Augmentation and Style Transfer with Artificial Neural Networks (벡터 기반 데이터 증강과 인공신경망 기반 특징 전달을 이용한 효율적인 균열 데이터 수집 기법)

  • Yun, Ju-Young; Kim, Donghui; Kim, Jong-Hyun
    • Proceedings of the Korean Society of Computer Information Conference / 2021.07a / pp.667-669 / 2021
  • In this paper, we propose a framework that builds training data with a vector-based data augmentation technique and then uses convolutional neural networks (CNN) to express patterns close to real cracks. Cracks in buildings are a cause of major accidents, including building collapses and falling-object accidents that lead to casualties. Solving this problem with artificial intelligence requires a large amount of data, but real crack images not only have complex patterns but are also difficult to collect in volume because acquiring them involves exposure to hazardous conditions. This database construction problem can be addressed with elastic distortion, which increases the amount of data by artificially deforming specific regions, but in this paper we use a CNN to produce crack patterns improved over that approach. Using the CNN rather than elastic distortion yielded results closer to real crack patterns, and by designing the augmentation on vector data rather than the commonly used pixel data, the method showed superiority in terms of crack variability. Even with only a small number of crack samples as input, we could easily build a crack database by generating diverse crack directions and patterns. In the long term, this is expected to contribute to structural safety assessment and to safer, more comfortable living environments free from anxiety about safety accidents.
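
A hypothetical sketch of the vector-based idea only, not the authors' pipeline: a crack is represented as a polyline, its control points are jittered to create new variants, and each variant is rasterized into a training image. The jitter scale, image size, and the example polyline are assumptions.

```python
import numpy as np
from PIL import Image, ImageDraw

def augment_polyline(points, jitter=3.0, rng=None):
    """Return a jittered copy of a crack polyline given as (x, y) pairs."""
    rng = rng or np.random.default_rng()
    pts = np.asarray(points, dtype=float)
    return pts + rng.normal(0.0, jitter, size=pts.shape)

def rasterize(points, size=(128, 128)):
    """Draw the polyline as a white crack on a black image."""
    img = Image.new("L", size, 0)
    ImageDraw.Draw(img).line([tuple(p) for p in points], fill=255, width=2)
    return np.asarray(img)

crack = [(10, 20), (40, 60), (70, 55), (110, 100)]     # illustrative control points
variants = [rasterize(augment_polyline(crack)) for _ in range(8)]
print(len(variants), variants[0].shape)                # 8 (128, 128)
```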
