• Title/Summary/Keyword: long short-term memory neural network

Comparison of Power Consumption Prediction Scheme Based on Artificial Intelligence (인공지능 기반 전력량예측 기법의 비교)

  • Lee, Dong-Gu;Sun, Young-Ghyu;Kim, Soo-Hyun;Sim, Issac;Hwang, Yu-Min;Kim, Jin-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.19 no.4 / pp.161-167 / 2019
  • Recently, demand forecasting techniques have been actively studied due to interest in a stable power supply amid surging power demand and the increasing spread of smart meters that enable real-time power measurement. In this study, we ran experiments with deep learning prediction models that learn actual measured household power usage data and output a forecast, after pre-processing the data with a moving average method. The predicted values are evaluated against the actual measured data. Such forecasting makes it possible to lower the power supply reserve ratio and reduce the waste of unused power. We conducted experiments on three types of networks: Multi Layer Perceptron (MLP), Recurrent Neural Network (RNN), and Long Short Term Memory (LSTM), and evaluated each scheme using MSE (Mean Squared Error) and MAE (Mean Absolute Error).
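
As a rough illustration of the setup this abstract describes (an LSTM forecaster over moving-averaged household power data, scored with MSE and MAE), the following PyTorch sketch shows one plausible arrangement. It is not the authors' code; the window sizes, layer widths, and random stand-in data are assumptions.

```python
# Minimal sketch (not the authors' code): an LSTM forecaster for household
# power usage with moving-average preprocessing and MSE/MAE evaluation.
import numpy as np
import torch
import torch.nn as nn

def moving_average(series, window=4):
    """Smooth the raw meter readings before windowing (pre-processing step)."""
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode="valid")

def make_windows(series, seq_len=24):
    """Turn a 1-D series into (input window, next value) training pairs."""
    xs, ys = [], []
    for i in range(len(series) - seq_len):
        xs.append(series[i:i + seq_len])
        ys.append(series[i + seq_len])
    return (torch.tensor(np.array(xs), dtype=torch.float32).unsqueeze(-1),
            torch.tensor(np.array(ys), dtype=torch.float32).unsqueeze(-1))

class LSTMForecaster(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, seq_len, 1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1, :])   # predict the next consumption value

# Training/evaluation skeleton using the two metrics named in the abstract.
series = moving_average(np.random.rand(1000))   # stand-in for real meter data
x, y = make_windows(series)
model = LSTMForecaster()
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
mse, mae = nn.MSELoss(), nn.L1Loss()            # L1Loss corresponds to MAE
for _ in range(10):
    optim.zero_grad()
    loss = mse(model(x), y)
    loss.backward()
    optim.step()
print("MSE:", mse(model(x), y).item(), "MAE:", mae(model(x), y).item())
```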

Hand Expression Recognition for Virtual Blackboard (가상 칠판을 위한 손 표현 인식)

  • Heo, Gyeongyong;Kim, Myungja;Song, Bok Deuk;Shin, Bumjoo
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.12 / pp.1770-1776 / 2021
  • Hand expression recognition combines hand pose recognition, based on the static shape of the hand, with hand gesture recognition, based on hand movement. In this paper, we propose a hand expression recognition method that recognizes symbols from the trajectory of hand movement on a virtual blackboard. Recognizing a sign drawn by hand on a virtual blackboard requires not only recognizing the sign from the hand movement but also hand pose recognition to find the start and end of data input. MediaPipe was used to recognize hand pose, and LSTM (Long Short Term Memory), a type of recurrent neural network, was used to recognize hand gestures from the time series data. To verify the effectiveness of the proposed method, it was applied to the recognition of numbers written on a virtual blackboard, and a recognition rate of about 94% was obtained.
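
A minimal sketch of the gesture-recognition half of such a pipeline: an LSTM classifying a fingertip trajectory (as might be produced per frame by a hand-landmark detector such as MediaPipe) into one of ten digits. Layer sizes, sequence length, and the dummy trajectories are assumptions, not the authors' settings.

```python
# Sketch only: an LSTM classifier over a sequence of fingertip coordinates.
import torch
import torch.nn as nn

class GestureLSTM(nn.Module):
    """Classifies a trajectory of (x, y) fingertip positions into a symbol."""
    def __init__(self, n_classes=10, hidden=64):
        super().__init__()
        self.lstm = nn.LSTM(input_size=2, hidden_size=hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, traj):              # traj: (batch, seq_len, 2)
        _, (h, _) = self.lstm(traj)
        return self.fc(h[-1])             # logits over the digit classes

model = GestureLSTM()
fake_trajectories = torch.randn(8, 60, 2)   # 8 gestures, 60 frames each (dummy data)
logits = model(fake_trajectories)
print(logits.argmax(dim=1))                 # predicted digit for each trajectory
```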

LSTM based Network Traffic Volume Prediction (LSTM 기반의 네트워크 트래픽 용량 예측)

  • Nguyen, Giang-Truong;Nguyen, Van-Quyet;Nguyen, Huu-Duy;Kim, Kyungbaek
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.362-364 / 2018
  • Predicting network traffic volume has recently become a popular topic because it supports many tasks, such as detecting abnormal network activity and provisioning network services. In particular, predicting the volume of the upcoming traffic from a series of recently observed traffic volumes is an interesting and challenging problem. In the past, this has been studied with time series forecasting methods such as moving averages and exponential smoothing. In this paper, we propose a long short-term memory neural network (LSTM) based network traffic volume prediction method. The proposed method uses the changing rate of observed traffic volume, the corresponding time window index, and a seasonality factor indicating the changing trend as input features, and predicts the upcoming network traffic. Experimental results on real datasets show that the proposed method outperforms other time series forecasting methods in predicting upcoming network traffic.
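
The abstract names three input features (changing rate of traffic volume, time-window index, seasonality factor). A minimal sketch of an LSTM regressor over such features might look as follows; the hidden size, sequence length, and dummy batch are illustrative assumptions, not the paper's configuration.

```python
# Illustrative sketch, not the authors' implementation: an LSTM regressor whose
# three input features mirror those described in the abstract.
import torch
import torch.nn as nn

class TrafficLSTM(nn.Module):
    def __init__(self, n_features=3, hidden=32):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):                  # x: (batch, seq_len, 3)
        h, _ = self.lstm(x)
        return self.out(h[:, -1])          # upcoming traffic volume estimate

# One dummy batch: 16 sequences of 12 past windows, each with the 3 features.
model = TrafficLSTM()
batch = torch.randn(16, 12, 3)
print(model(batch).shape)                  # torch.Size([16, 1])
```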

AI based complex sensor application study for energy management in WTP (정수장에서의 에너지 관리를 위한 AI 기반 복합센서 적용 연구)

  • Hong, Sung-Taek;An, Sang-Byung;Kim, Kuk-Il;Sung, Min-Seok
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2022.05a / pp.322-323 / 2022
  • The most essential requirement for the optimal operation of a water purification plant is to accurately predict the pattern and amount of tap water used by consumers. The required amount of tap water should be pumped to the distribution reservoir and stored, and the required flow rate should be supplied in a timely manner using the minimum amount of electrical energy. Among water purification plant demand predictions, the short-term demand forecasting required for energy-optimized operation has been performed considering seasons, major time periods, and regional characteristics, using time series analysis, regression analysis, and neural network algorithms. In this paper, we analyze energy management methods by examining the applicability of AI-based complex sensors using LSTM (Long Short-Term Memory) and GRU (Gated Recurrent Units), which are types of recurrent neural networks.
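
To make the LSTM/GRU comparison mentioned above concrete, here is a minimal scaffold in which the same short-term demand forecaster is built with either cell; the feature set, sizes, and dummy inputs are assumptions, not the paper's configuration.

```python
# Minimal comparison scaffold (assumptions, not the paper's code): the same
# short-term demand forecaster built once with an LSTM cell and once with a GRU.
import torch
import torch.nn as nn

class DemandForecaster(nn.Module):
    def __init__(self, cell="lstm", n_features=4, hidden=32):
        super().__init__()
        rnn = nn.LSTM if cell == "lstm" else nn.GRU
        self.rnn = rnn(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, hours, features such as season/day/flow)
        out, _ = self.rnn(x)
        return self.head(out[:, -1])      # demand for the next time step

x = torch.randn(8, 24, 4)                 # 8 samples, 24 hourly steps, 4 dummy features
for cell in ("lstm", "gru"):
    print(cell, DemandForecaster(cell)(x).shape)
```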

Deep Learning Architectures and Applications (딥러닝의 모형과 응용사례)

  • Ahn, SungMahn
    • Journal of Intelligence and Information Systems / v.22 no.2 / pp.127-142 / 2016
  • A deep learning model is a kind of neural network that allows multiple hidden layers. There are various deep learning architectures, such as convolutional neural networks, deep belief networks, and recurrent neural networks. They have been applied to fields like computer vision, automatic speech recognition, natural language processing, audio recognition, and bioinformatics, where they have been shown to produce state-of-the-art results on various tasks. Among these architectures, convolutional neural networks and recurrent neural networks are classified as supervised learning models. In recent years, these supervised learning models have gained more popularity than unsupervised learning models such as deep belief networks, because supervised learning models have produced successful applications in the fields mentioned above. Deep learning models can be trained with the backpropagation algorithm. Backpropagation is an abbreviation for "backward propagation of errors" and is a common method of training artificial neural networks, used in conjunction with an optimization method such as gradient descent. The method calculates the gradient of an error function with respect to all the weights in the network. The gradient is fed to the optimization method, which in turn uses it to update the weights in an attempt to minimize the error function. Convolutional neural networks use a special architecture that is particularly well adapted to classifying images. This architecture makes convolutional networks fast to train, which in turn helps us train deep, multi-layer networks that are very good at classifying images. These days, deep convolutional networks are used in most neural networks for image recognition. Convolutional neural networks use three basic ideas: local receptive fields, shared weights, and pooling. By local receptive fields, we mean that each neuron in the first (or any) hidden layer is connected to a small region of the input (or previous layer's) neurons. Shared weights mean that we use the same weights and bias for each of the local receptive fields, so all the neurons in the hidden layer detect exactly the same feature, just at different locations in the input image. In addition to the convolutional layers just described, convolutional neural networks also contain pooling layers, usually placed immediately after convolutional layers; pooling layers simplify the information in the output from the convolutional layer. Recent convolutional network architectures have 10 to 20 hidden layers and billions of connections between units. Training deep learning networks took weeks several years ago, but thanks to progress in GPUs and algorithmic improvements, training time has been reduced to several hours. Neural networks with time-varying behavior are known as recurrent neural networks, or RNNs. A recurrent neural network is a class of artificial neural network in which connections between units form a directed cycle. This creates an internal state of the network that allows it to exhibit dynamic temporal behavior. Unlike feedforward neural networks, RNNs can use their internal memory to process arbitrary sequences of inputs. Early RNN models turned out to be very difficult to train, harder even than deep feedforward networks. The reason is the unstable gradient problem, i.e., vanishing and exploding gradients. The gradient can get smaller and smaller as it is propagated back through layers, which makes learning in early layers extremely slow. The problem gets worse in RNNs, since gradients are propagated backward not only through layers but also through time. If the network runs for a long time, the gradient can become extremely unstable and hard to learn from. It has become possible to incorporate an idea known as long short-term memory units (LSTMs) into RNNs. LSTMs make it much easier to get good results when training RNNs, and many recent papers make use of LSTMs or related ideas.
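
The three convolutional ideas described above (local receptive fields, shared weights, pooling) can be illustrated with a toy PyTorch network; the layer sizes below are arbitrary examples for 28x28 grayscale images and are not taken from the paper.

```python
# Toy illustration of local receptive fields, shared weights, and pooling.
import torch
import torch.nn as nn

cnn = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=5),   # 5x5 local receptive fields with shared weights
    nn.ReLU(),
    nn.MaxPool2d(2),                  # pooling layer summarizes the conv output
    nn.Conv2d(8, 16, kernel_size=3),
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(16 * 5 * 5, 10),        # classify e.g. 28x28 grayscale digits
)
print(cnn(torch.randn(1, 1, 28, 28)).shape)   # torch.Size([1, 10])
```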

Social Media based Real-time Event Detection by using Deep Learning Methods

  • Nguyen, Van Quan;Yang, Hyung-Jeong;Kim, Young-chul;Kim, Soo-hyung;Kim, Kyungbaek
    • Smart Media Journal / v.6 no.3 / pp.41-48 / 2017
  • Event detection using social media has become widespread, since social network services are an active communication channel for connecting with others and diffusing news. In particular, the real-time nature of social media creates opportunities to support real-time applications and systems. A social network such as Twitter is a potential data source for exploring useful information by mining messages posted by the user community. This paper proposes a novel system for temporal event detection by analyzing social data; the resulting information can be used by first responders, decision makers, or news agencies to gain insight into a situation. The proposed approach takes advantage of deep learning methods for its two main tasks: identifying informative data in a noisy environment and detecting events over time. The former is handled by a Convolutional Neural Network model trained on labeled Twitter data, while the latter is supported by a Recurrent Neural Network module. We demonstrate our approach and experimental results on a case study of earthquake situations. Our system is more adaptive than systems that use traditional methods, since deep learning extracts features from the data without spending a lot of time constructing features by hand; this makes our approach easy to extend to new practical contexts. Moreover, the proposed system responds with an acceptable delay of several minutes, which is helpful for supporting news agencies or relief planning in case of disaster events.
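
A rough sketch of the two-stage idea (a CNN to flag informative posts, then a recurrent module watching the stream of flagged posts over time); the architectures, embedding size, and dummy tensors are assumptions rather than the paper's actual models.

```python
# Rough two-stage sketch under assumptions: a 1-D CNN flags informative posts,
# then an RNN watches the per-minute count of flagged posts to detect an event.
import torch
import torch.nn as nn

class TweetCNN(nn.Module):
    """Binary classifier over a sequence of word embeddings (informative or not)."""
    def __init__(self, emb_dim=50):
        super().__init__()
        self.conv = nn.Conv1d(emb_dim, 64, kernel_size=3)
        self.pool = nn.AdaptiveMaxPool1d(1)
        self.fc = nn.Linear(64, 2)

    def forward(self, emb):               # emb: (batch, emb_dim, words)
        return self.fc(self.pool(torch.relu(self.conv(emb))).squeeze(-1))

class EventRNN(nn.Module):
    """Flags event onsets from a time series of informative-post counts."""
    def __init__(self, hidden=16):
        super().__init__()
        self.rnn = nn.LSTM(1, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, 2)

    def forward(self, counts):            # counts: (batch, minutes, 1)
        out, _ = self.rnn(counts)
        return self.fc(out)               # per-minute event / no-event logits

print(TweetCNN()(torch.randn(4, 50, 20)).shape)   # torch.Size([4, 2])
print(EventRNN()(torch.randn(1, 30, 1)).shape)    # torch.Size([1, 30, 2])
```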

Prediction Model of User Physical Activity using Data Characteristics-based Long Short-term Memory Recurrent Neural Networks

  • Kim, Joo-Chang;Chung, Kyungyong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.4 / pp.2060-2077 / 2019
  • Recently, mobile healthcare services have attracted significant attention because of the emerging development and supply of diverse wearable devices. Smartwatches and health bands are the most common type of mobile-based wearable devices and their market size is increasing considerably. However, simple value comparisons based on accumulated data have revealed certain problems, such as the standardized nature of health management and the lack of personalized health management service models. The convergence of information technology (IT) and biotechnology (BT) has shifted the medical paradigm from continuous health management and disease prevention to the development of a system that can be used to provide ground-based medical services regardless of the user's location. Moreover, the IT-BT convergence has necessitated the development of lifestyle improvement models and services that utilize big data analysis and machine learning to provide mobile healthcare-based personal health management and disease prevention information. Users' health data, which are specific as they change over time, are collected by different means according to the users' lifestyle and surrounding circumstances. In this paper, we propose a prediction model of user physical activity that uses data characteristics-based long short-term memory (DC-LSTM) recurrent neural networks (RNNs). To provide personalized services, the characteristics and surrounding circumstances of data collectable from mobile host devices were considered in the selection of variables for the model. The data characteristics considered were ease of collection, which represents whether or not variables are collectable, and frequency of occurrence, which represents whether or not changes made to input values constitute significant variables in terms of activity. The variables selected for providing personalized services were activity, weather, temperature, mean daily temperature, humidity, UV, fine dust, asthma and lung disease probability index, skin disease probability index, cadence, travel distance, mean heart rate, and sleep hours. The selected variables were classified according to the data characteristics. To predict activity, an LSTM RNN was built that uses the classified variables as input data and learns the dynamic characteristics of time series data. LSTM RNNs resolve the vanishing gradient problem that occurs in existing RNNs. They are classified into three different types according to data characteristics and constructed through connections among the LSTMs. The constructed neural network learns training data and predicts user activity. To evaluate the proposed model, the root mean square error (RMSE) was used in the performance evaluation of the user physical activity prediction method for which an autoregressive integrated moving average (ARIMA) model, a convolutional neural network (CNN), and an RNN were used. The results show that the proposed DC-LSTM RNN method yields an excellent mean RMSE value of 0.616. The proposed method is used for predicting significant activity considering the surrounding circumstances and user status utilizing the existing standardized activity prediction services. It can also be used to predict user physical activity and provide personalized healthcare based on the data collectable from mobile host devices.
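
One way to sketch the grouped idea (separate LSTMs for different variable groups, combined to predict activity and evaluated with RMSE) is shown below; the grouping, sizes, and dummy data are hypothetical and greatly simplified compared to the DC-LSTM described in the abstract.

```python
# Simplified sketch of a grouped-LSTM predictor with RMSE evaluation
# (hypothetical grouping and sizes, not the authors' DC-LSTM).
import torch
import torch.nn as nn
import torch.nn.functional as F

class GroupedLSTM(nn.Module):
    def __init__(self, group_sizes=(5, 4, 4), hidden=32):
        super().__init__()
        self.branches = nn.ModuleList(
            nn.LSTM(g, hidden, batch_first=True) for g in group_sizes)
        self.head = nn.Linear(hidden * len(group_sizes), 1)

    def forward(self, groups):             # list of (batch, seq, group_size) tensors
        feats = [branch(x)[0][:, -1] for branch, x in zip(self.branches, groups)]
        return self.head(torch.cat(feats, dim=1))

model = GroupedLSTM()
batch = [torch.randn(8, 24, g) for g in (5, 4, 4)]   # dummy day-long sequences
pred, target = model(batch), torch.randn(8, 1)
rmse = torch.sqrt(F.mse_loss(pred, target))          # the metric reported in the paper
print(rmse.item())
```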

Two-Dimensional Attention-Based LSTM Model for Stock Index Prediction

  • Yu, Yeonguk;Kim, Yoon-Joong
    • Journal of Information Processing Systems / v.15 no.5 / pp.1231-1242 / 2019
  • This paper presents a two-dimensional attention-based long short-term memory (2D-ALSTM) model for stock index prediction, incorporating input attention and temporal attention mechanisms for weighting important stocks and important time steps, respectively. The proposed model is designed to overcome the long-term dependency, stock selection, and stock volatility delay problems that negatively affect existing models. The 2D-ALSTM model is validated in a comparative experiment involving two attention-based models, multi-input LSTM (MI-LSTM) and dual-stage attention-based recurrent neural network (DARNN), with real stock data being used for training and evaluation. The model achieves superior performance compared to MI-LSTM and DARNN for stock index prediction on a KOSPI100 dataset.
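
A heavily simplified sketch of the two attention mechanisms (input attention over stocks, temporal attention over time steps) wrapped around an LSTM; the attention parameterization, sizes, and data here are illustrative assumptions, not the 2D-ALSTM specification.

```python
# Simplified sketch of input attention + temporal attention around an LSTM.
import torch
import torch.nn as nn

class TwoDimAttentionLSTM(nn.Module):
    def __init__(self, n_stocks=100, hidden=64):
        super().__init__()
        self.input_attn = nn.Linear(n_stocks, n_stocks)    # weights important stocks
        self.lstm = nn.LSTM(n_stocks, hidden, batch_first=True)
        self.temporal_attn = nn.Linear(hidden, 1)           # weights important time steps
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):                      # x: (batch, time, n_stocks)
        a_in = torch.softmax(self.input_attn(x), dim=-1)
        h, _ = self.lstm(x * a_in)             # (batch, time, hidden)
        a_t = torch.softmax(self.temporal_attn(h), dim=1)
        context = (a_t * h).sum(dim=1)         # attention-weighted summary over time
        return self.out(context)               # next index value

model = TwoDimAttentionLSTM()
print(model(torch.randn(4, 20, 100)).shape)    # torch.Size([4, 1])
```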

Industrial Process Monitoring and Fault Diagnosis Based on Temporal Attention Augmented Deep Network

  • Mu, Ke;Luo, Lin;Wang, Qiao;Mao, Fushun
    • Journal of Information Processing Systems / v.17 no.2 / pp.242-252 / 2021
  • Following the intuition that the local information in individual time instances is hardly incorporated into the posterior sequence in long short-term memory (LSTM), this paper proposes an attention-augmented mechanism for fault diagnosis of complex chemical process data. Unlike conventional fault diagnosis and classification methods, an attention mechanism layer is introduced to detect and focus on local temporal information. The augmented deep network preserves each local instance's importance and contribution and allows interpretable feature representation and classification simultaneously. Comprehensive comparative analyses demonstrate that the developed model achieves a high average fault classification rate of 95.49%. The results are comparable to those obtained using various other techniques on the Tennessee Eastman benchmark process.
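
As a hedged illustration of a temporal-attention layer on top of an LSTM that also exposes per-instance weights for inspection, consider the sketch below; the variable and fault counts are only indicative of a Tennessee-Eastman-style setup and are not taken from the paper.

```python
# Sketch under assumptions: an LSTM over process variables with a temporal-attention
# layer whose weights are returned so each time instance's contribution is inspectable.
import torch
import torch.nn as nn

class AttentionFaultClassifier(nn.Module):
    def __init__(self, n_vars=52, hidden=64, n_faults=21):
        super().__init__()
        self.lstm = nn.LSTM(n_vars, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)
        self.cls = nn.Linear(hidden, n_faults)

    def forward(self, x):                       # x: (batch, time, process variables)
        h, _ = self.lstm(x)
        w = torch.softmax(self.attn(h), dim=1)  # per-instance importance weights
        return self.cls((w * h).sum(dim=1)), w  # fault logits + attention weights

logits, weights = AttentionFaultClassifier()(torch.randn(2, 100, 52))
print(logits.shape, weights.shape)              # (2, 21) and (2, 100, 1)
```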

Data abnormal detection using bidirectional long-short neural network combined with artificial experience

  • Yang, Kang;Jiang, Huachen;Ding, Youliang;Wang, Manya;Wan, Chunfeng
    • Smart Structures and Systems / v.29 no.1 / pp.117-127 / 2022
  • Data anomalies seriously threaten the reliability of bridge structural health monitoring systems and may trigger system misjudgment, so an efficient and accurate data anomaly detection method is needed. Traditional anomaly detection methods extract various abnormal features as key indicators for identifying data anomalies and then set thresholds manually for each feature to identify specific anomalies; this is the artificial-experience method. However, because it generalizes poorly across sensors, it often incurs high labor costs. Another approach to anomaly detection is a data-driven one based on machine learning methods. Among these, the bidirectional long short-term memory neural network (BiLSTM), an effective classification method, excels at finding complex relationships in multivariate time series data. However, training on unprocessed raw signals often leads to low computational efficiency and poor convergence because appropriate feature selection is lacking. Therefore, this article combines the advantages of the two methods by feeding manually engineered statistical features into a deep learning model. Experimental comparative studies show that the BiLSTM model with appropriate feature input achieves an accuracy of 87-94%. This paper also provides basic principles of data cleaning, discusses the typical features of various anomalies, and highlights optimization strategies for selecting the feature space based on artificial experience.
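
A minimal sketch of the combined approach (hand-crafted statistical features per signal segment fed to a bidirectional LSTM classifier); the feature set, class count, and dummy signals are assumptions, not the authors' configuration.

```python
# Minimal sketch (not the authors' code): manual-experience statistical features
# per sensor segment fed into a bidirectional LSTM that classifies the anomaly type.
import torch
import torch.nn as nn

def segment_features(segment):
    """A few manual-experience statistics for one sensor segment (illustrative set)."""
    return torch.stack([segment.mean(), segment.std(), segment.min(),
                        segment.max(), segment.abs().mean()])

class BiLSTMAnomalyClassifier(nn.Module):
    def __init__(self, n_features=5, hidden=32, n_classes=7):
        super().__init__()
        self.lstm = nn.LSTM(n_features, hidden, batch_first=True, bidirectional=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):                  # x: (batch, segments, n_features)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])         # anomaly class for the window

signal = torch.randn(10, 1000)                                   # 10 dummy segments
feats = torch.stack([segment_features(s) for s in signal]).unsqueeze(0)
print(BiLSTMAnomalyClassifier()(feats).shape)                    # torch.Size([1, 7])
```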