• Title/Summary/Keyword: RNN (recurrent neural networks)

A Study on Performance Improvement of Recurrent Neural Networks Algorithm using Word Group Expansion Technique (단어그룹 확장 기법을 활용한 순환신경망 알고리즘 성능개선 연구)

  • Park, Dae Seung;Sung, Yeol Woo;Kim, Cheong Ghil
    • Journal of Industrial Convergence / v.20 no.4 / pp.23-30 / 2022
  • With the recent development of artificial intelligence (AI) and deep learning, conversational AI chatbots have grown in importance, and chatbot research is being conducted in various fields. For ease of development, chatbots are typically built on an open source or commercial platform. These platforms mainly use RNNs and RNN-derived algorithms. The RNN algorithm has the advantages of fast learning speed, ease of monitoring and verification, and good inference performance. This paper studies a method for improving the inference performance of RNNs and RNN-derived algorithms. The proposed method applies a word group expansion learning technique to the key words of each sentence when the RNN and its derived algorithms are trained. As a result, the three algorithms with a recurrent structure (RNN, GRU, and LSTM) achieved inference performance improvements ranging from a minimum of 0.37% to a maximum of 1.25%. These results can accelerate the adoption of AI chatbots in related industries and can contribute to the use of various RNN-derived algorithms. Future research should study the effect of various activation functions on the performance of artificial neural network algorithms.

S2-Net: Machine reading comprehension with SRU-based self-matching networks

  • Park, Cheoneum;Lee, Changki;Hong, Lynn;Hwang, Yigyu;Yoo, Taejoon;Jang, Jaeyong;Hong, Yunki;Bae, Kyung-Hoon;Kim, Hyun-Ki
    • ETRI Journal / v.41 no.3 / pp.371-382 / 2019
  • Machine reading comprehension is the task of understanding a given context and finding the correct response in that context. A simple recurrent unit (SRU) is a model that solves the vanishing gradient problem in a recurrent neural network (RNN) using neural gates, as the gated recurrent unit (GRU) and long short-term memory (LSTM) do; moreover, it removes the previous hidden state from the input gate, which makes it faster than GRU and LSTM. A self-matching network, used in R-Net, can have an effect similar to coreference resolution because it obtains context information of similar meaning by calculating attention weights over its own RNN sequence. In this paper, we construct a dataset for Korean machine reading comprehension and propose an $S^2$-Net model that adds a self-matching layer to an encoder RNN built from multilayer SRUs. The experimental results show that the proposed $S^2$-Net model achieves 68.82% EM and 81.25% F1 as a single model and 70.81% EM and 82.48% F1 as an ensemble on the Korean machine reading comprehension test dataset, and 71.30% EM and 80.37% F1 as a single model and 73.29% EM and 81.54% F1 as an ensemble on the SQuAD dev dataset.
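The EM and F1 figures quoted above are span-level answer metrics. As a rough illustration only, the sketch below computes exact match and token-overlap F1 in the style commonly used for SQuAD-like evaluation. The function names `exact_match` and `token_f1` are hypothetical, and the normalization (lowercasing and whitespace splitting) is a simplifying assumption rather than the evaluation script used in the paper.

```python
from collections import Counter


def exact_match(prediction, reference):
    """Return 1.0 if the two answers match after trivial normalization, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def token_f1(prediction, reference):
    """Token-overlap F1 between a predicted answer and a reference answer."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("Blue House", "blue house"))
    print(token_f1("the blue house", "blue house"))
```

A dataset-level score would average these per-question values; how multiple reference answers are handled is left out of this sketch.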

Initial Small Data Reveal Rumor Traits via Recurrent Neural Networks (초기 소량 데이터와 RNN을 활용한 루머 전파 추적 기법)

  • Kwon, Sejeong;Cha, Meeyoung
    • Journal of KIISE / v.44 no.7 / pp.680-685 / 2017
  • The emergence of online media and their data has enabled data-driven methods to solve challenging and complex tasks such as rumor classification. Recently, deep learning based models have been shown to be among the fastest and most accurate algorithms for such problems. These new models, however, rely on either complete data or several days' worth of data, limiting their applicability in real time. In this study, we go beyond this limit and test the possibility of very early rumor detection via recurrent neural networks (RNNs). Our model takes social media streams as time series input, along with basic meta-information about the rumormongers, including the follower count, and the psycholinguistic traits of the rumor content itself. Based on an analysis of millions of social media posts on 498 real rumors and 494 non-rumor events, our RNN-based model detected rumors from only the 30 initial posts (i.e., within a few hours of rumor circulation) with an F1 score of 0.74. This finding widens the scope of possibilities for building a fast and efficient rumor detection system.

Nonlinear Prediction using Gamma Multilayered Neural Network (Gamma 다층 신경망을 이용한 비선형 적응예측)

  • Kim Jong-In;Go Il-Hwan;Choi Han-Go
    • Journal of the Institute of Convergence Signal Processing / v.7 no.2 / pp.53-59 / 2006
  • Dynamic neural networks have been applied to diverse fields requiring temporal signal processing, such as system identification and signal prediction. This paper proposes the gamma neural network (GAM), which uses a gamma memory kernel in the hidden layer of a feedforward multilayered network to improve the dynamics of the network, and then describes nonlinear adaptive prediction using the proposed network as an adaptive filter. The proposed network is evaluated in nonlinear signal prediction and compared with feedforward (FNN) and recurrent neural networks (RNN) in terms of prediction performance. Simulation results show that the GAM network performs better with respect to convergence speed and prediction accuracy, indicating that it can be a more effective prediction model than conventional multilayered networks in nonlinear prediction for nonstationary signals.
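The abstract does not spell out the gamma memory kernel. As a hedged illustration, the sketch below implements the gamma memory recursion as it is usually stated in the literature, x_0(t) = u(t) and x_k(t) = (1 - mu) x_k(t-1) + mu x_{k-1}(t-1), for a scalar input. The function name `gamma_memory`, the parameter names, and the default values are assumptions for illustration; the paper's exact formulation and its placement inside the hidden layer may differ.

```python
def gamma_memory(signal, order=3, mu=0.5):
    """Run a gamma memory of the given order over a scalar signal.

    Returns one tap vector [x_0(t), ..., x_order(t)] per time step.
    With mu = 1.0 the memory reduces to an ordinary tapped delay line.
    """
    taps = [0.0] * (order + 1)
    history = []
    for u in signal:
        prev = taps[:]          # tap values at time t-1
        taps[0] = u             # x_0(t) is the current input
        for k in range(1, order + 1):
            # Each tap leaks its own past value and absorbs the previous tap.
            taps[k] = (1.0 - mu) * prev[k] + mu * prev[k - 1]
        history.append(taps[:])
    return history


if __name__ == "__main__":
    # Impulse response: deeper taps respond later and more smoothly.
    for step in gamma_memory([1.0, 0.0, 0.0, 0.0], order=2, mu=0.5):
        print(step)
```

The single parameter mu trades memory depth against temporal resolution, which is the usual motivation for using such a kernel in place of a fixed delay line.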

Deep Learning Architectures and Applications (딥러닝의 모형과 응용사례)

  • Ahn, SungMahn
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.127-142 / 2016
  • A deep learning model is a kind of neural network that allows multiple hidden layers. There are various deep learning architectures, such as convolutional neural networks, deep belief networks, and recurrent neural networks. These have been applied to fields like computer vision, automatic speech recognition, natural language processing, audio recognition, and bioinformatics, where they have been shown to produce state-of-the-art results on various tasks. Among those architectures, convolutional neural networks and recurrent neural networks are classified as supervised learning models. In recent years, these supervised learning models have gained more popularity than unsupervised learning models such as deep belief networks, because they have produced prominent applications in the fields mentioned above. Deep learning models can be trained with the backpropagation algorithm. Backpropagation is an abbreviation for "backward propagation of errors" and is a common method of training artificial neural networks, used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of an error function with respect to all the weights in the network. The gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the error function. Convolutional neural networks use a special architecture that is particularly well adapted to classifying images. Using this architecture makes convolutional networks fast to train. This, in turn, helps us train deep, multi-layer networks, which are very good at classifying images. These days, deep convolutional networks are used in most neural networks for image recognition. Convolutional neural networks use three basic ideas: local receptive fields, shared weights, and pooling.
By local receptive fields, we mean that each neuron in the first (or any) hidden layer is connected to a small region of the input (or previous layer's) neurons. Shared weights mean that the same weights and bias are used for each local receptive field. This means that all the neurons in the hidden layer detect exactly the same feature, just at different locations in the input image. In addition to the convolutional layers just described, convolutional neural networks also contain pooling layers. Pooling layers are usually used immediately after convolutional layers. What the pooling layers do is simplify the information in the output from the convolutional layer. Recent convolutional network architectures have 10 to 20 hidden layers and billions of connections between units. Training deep networks took weeks several years ago, but thanks to progress in GPUs and algorithmic improvements, training time has been reduced to several hours. Neural networks with time-varying behavior are known as recurrent neural networks or RNNs. A recurrent neural network is a class of artificial neural network in which connections between units form a directed cycle. This creates an internal state of the network, which allows it to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Early RNN models turned out to be very difficult to train, harder even than deep feedforward networks. The reason is the unstable gradient problem, which covers both vanishing and exploding gradients. The gradient can get smaller and smaller as it is propagated back through layers. This makes learning in early layers extremely slow. The problem actually gets worse in RNNs, since gradients are propagated backward not just through layers but also through time. If the network runs for a long time, that can make the gradient extremely unstable and hard to learn from.
It has been possible to incorporate an idea known as long short-term memory units (LSTMs) into RNNs. LSTMs make it much easier to get good results when training RNNs, and many recent papers make use of LSTMs or related ideas.
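The unstable gradient problem described in this abstract can be seen in a toy setting. The sketch below, a minimal illustration under stated assumptions rather than anything taken from the paper, uses a scalar RNN h_t = tanh(w h_{t-1} + u x_t) and tracks the magnitude of d h_T / d h_0, which by the chain rule is a product of one factor w (1 - h_t^2) per time step. The function name `gradient_through_time` and all parameter values are hypothetical.

```python
import math


def gradient_through_time(w, inputs, u=1.0, h0=0.0):
    """Return |d h_t / d h_0| after each step of a scalar tanh RNN."""
    h = h0
    grad = 1.0
    magnitudes = []
    for x in inputs:
        h = math.tanh(w * h + u * x)
        # Chain rule: d h_t / d h_{t-1} = w * (1 - h_t ** 2)
        grad *= w * (1.0 - h * h)
        magnitudes.append(abs(grad))
    return magnitudes


if __name__ == "__main__":
    steps = [0.0] * 20
    # Recurrent weight below 1: the factor shrinks geometrically (vanishing).
    print(gradient_through_time(0.5, steps)[-1])   # about 9.5e-07
    # Recurrent weight above 1 near h = 0: the factor grows (exploding).
    print(gradient_through_time(1.5, steps)[-1])   # about 3.3e+03
```

With zero input the state stays at zero, so each factor is simply w, and the product over 20 steps is w to the power 20. Gated units such as the LSTM are designed to keep this product closer to 1 over long spans.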

Prediction of pollution loads in the Geum River upstream using the recurrent neural network algorithm

  • Lim, Heesung;An, Hyunuk;Kim, Haedo;Lee, Jeaju
    • Korean Journal of Agricultural Science / v.46 no.1 / pp.67-78 / 2019
  • The purpose of this study was to predict water quality using the RNN (recurrent neural network) and LSTM (long short-term memory). These are advanced machine learning algorithms that are better suited to time series learning than conventional artificial neural networks; however, they have not previously been investigated for water quality prediction. Three water quality indexes, the BOD (biochemical oxygen demand), COD (chemical oxygen demand), and SS (suspended solids), are predicted by the RNN and LSTM. TensorFlow, an open source library developed by Google, was used to implement the machine learning algorithms. The Okcheon observation point in the Geum River basin in the Republic of Korea was selected as the target point for the water quality prediction. Ten years of daily observed meteorological (daily temperature and daily wind speed) and hydrological (water level and flow discharge) data were used as the inputs, and irregularly observed water quality (BOD, COD, and SS) data were used as the training targets. The irregularly observed water quality data were converted into daily data with the linear interpolation method. The water quality one day ahead was predicted by the machine learning algorithms, and it was found that BOD, COD, and SS, which are highly non-linear, can be predicted with high accuracy compared to existing physical modeling results. The sequence length and number of iterations were varied to compare the performance of the algorithms.
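The conversion of irregularly observed water quality values into daily values by linear interpolation can be sketched as follows. This is a minimal illustration, assuming observations are given as (day index, value) pairs sorted by day with distinct days; the function name `interpolate_daily` and the sample values are hypothetical and are not data from the study.

```python
def interpolate_daily(observations):
    """Fill every day between the first and last observation.

    observations: list of (day_index, value) pairs, sorted by day_index,
    with integer, strictly increasing day indexes.
    Returns a list of (day_index, value) pairs, one per day.
    """
    daily = []
    for (d0, v0), (d1, v1) in zip(observations, observations[1:]):
        for day in range(d0, d1):
            # Fraction of the way from the earlier to the later observation.
            frac = (day - d0) / (d1 - d0)
            daily.append((day, v0 + frac * (v1 - v0)))
    daily.append(observations[-1])
    return daily


if __name__ == "__main__":
    # Hypothetical BOD readings taken on days 0, 4, and 6.
    for day, value in interpolate_daily([(0, 2.0), (4, 4.0), (6, 3.0)]):
        print(day, value)
```

The resulting daily series can then be aligned with the daily meteorological and hydrological inputs before being windowed into sequences for the RNN or LSTM.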

Performance of Exercise Posture Correction System Based on Deep Learning (딥러닝 기반 운동 자세 교정 시스템의 성능)

  • Hwang, Byungsun;Kim, Jeongho;Lee, Ye-Ram;Kyeong, Chanuk;Seon, Joonho;Sun, Young-Ghyu;Kim, Jin-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.22 no.5 / pp.177-183 / 2022
  • Recently, interest in home training has grown due to COVID-19. Accordingly, research on applying HAR (human activity recognition) technology to home training has been conducted. However, existing HAR studies have addressed static rather than dynamic activities. This paper proposes a deep learning model that can analyze dynamic exercise postures and report the accuracy of the user's exercise posture. Fitness images from AI-hub are analyzed with BlazePose. The experiment compares three types of deep learning model: RNN (recurrent neural network), LSTM (long short-term memory), and CNN (convolutional neural network). In the simulation results, the F1-scores of the RNN, LSTM, and CNN were 0.49, 0.87, and 0.98, respectively. The simulation results confirm that the CNN is more suitable for human activity recognition than the other models. More exercise postures can be analyzed by using a wider variety of training data.

Intelligent Control of Nuclear Power Plant Steam Generator Using Neural Networks (신경회로망을 이용한 원자력발전소 증기발생기의 지능제어)

  • Kim, Sung-Soo;Lee, Jae-Gi;Choi, Jin-Young
    • Journal of Institute of Control, Robotics and Systems / v.6 no.2 / pp.127-137 / 2000
  • This paper presents a novel neural-network-based controller for the water level of a nuclear power plant steam generator. The controller consists of a model reference feedback linearization controller and a PI controller that stabilizes the feedback linearization controller. The feedback linearization controller consists of a neural network model and an inversion module that uses the neural network model to compute the control input to the steam generator. We chose a Piecewise Linearly Trained Network (PLTN) and a Recurrent Neural Network (RNN) as approximators of the plant and used them in calculating the input from the feedback linearization controller. Combining the two controllers gives better performance than using a PI controller alone. Control results for both the PLTN and the RNN are given.

EEG Dimensional Reduction with Stack AutoEncoder for Emotional Recognition using LSTM/RNN (LSTM/RNN을 사용한 감정인식을 위한 스택 오토 인코더로 EEG 차원 감소)

  • Aliyu, Ibrahim;Lim, Chang-Gyoon
    • The Journal of the Korea institute of electronic communication sciences / v.15 no.4 / pp.717-724 / 2020
  • Due to the important role played by emotion in human interaction, affective computing is dedicated to understanding and regulating emotion through human-aware artificial intelligence. A better understanding of emotion will allow conditions such as depression, autism, attention deficit hyperactivity disorder, and game addiction to be better managed, as they are all associated with emotion. Various studies on emotion recognition have been conducted to address these problems. Applying machine learning to emotion recognition requires efforts to reduce the complexity of the algorithm and improve its accuracy. In this paper, we investigate emotion Electroencephalogram (EEG) feature reduction using a Stack AutoEncoder (SAE) and classification using Long Short-Term Memory/Recurrent Neural Networks (LSTM/RNN). The proposed method reduced the complexity of the model and significantly enhanced the performance of the classifiers.

System Identification Using Gamma Multilayer Neural Network (감마 다층 신경망을 이용한 시스템 식별)

  • Go, Il-Whan;Won, Sang-Chul;Choi, Han-Go
    • Journal of the Institute of Convergence Signal Processing / v.9 no.3 / pp.238-244 / 2008
  • Dynamic neural networks have been applied to diverse fields requiring temporal signal processing. This paper presents the gamma neural network (GAM) to improve the dynamics of multilayer networks. The GAM network uses the gamma memory kernel in the hidden layer of a feedforward multilayer network. The GAM network is evaluated in linear and nonlinear system identification and compared with feedforward (FNN) and recurrent neural networks (RNN) in terms of performance. Experimental results show that the GAM network performs better with respect to convergence and accuracy, indicating that it can be a more effective network than conventional multilayer networks in system identification.