• Title/Summary/Keyword: GRU Model

Search Result 123, Processing Time 0.028 seconds

Document Classification using Recurrent Neural Network with Word Sense and Contexts (단어의 의미와 문맥을 고려한 순환신경망 기반의 문서 분류)

  • Joo, Jong-Min;Kim, Nam-Hun;Yang, Hyung-Jeong;Park, Hyuck-Ro
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.7 no.7
    • /
    • pp.259-266
    • /
    • 2018
  • In this paper, we propose a method to classify a document using a Recurrent Neural Network by extracting features considering word sense and contexts. Word2vec method is adopted to include the order and meaning of the words expressing the word in the document as a vector. Doc2vec is applied for considering the context to extract the feature of the document. RNN classifier, which includes the output of the previous node as the input of the next node, is used as the document classification method. RNN classifier presents good performance for document classification because it is suitable for sequence data among neural network classifiers. We applied GRU (Gated Recurrent Unit) model which solves the vanishing gradient problem of RNN. It also reduces computation speed. We used one Hangul document set and two English document sets for the experiments and GRU based document classifier improves performance by about 3.5% compared to CNN based document classifier.

Detecting Anomalies in Time-Series Data using Unsupervised Learning and Analysis on Infrequent Signatures

  • Bian, Xingchao
    • Journal of IKEEE
    • /
    • v.24 no.4
    • /
    • pp.1011-1016
    • /
    • 2020
  • We propose a framework called Stacked Gated Recurrent Unit - Infrequent Residual Analysis (SG-IRA) that detects anomalies in time-series data that can be trained on streams of raw sensor data without any pre-labeled dataset. To enable such unsupervised learning, SG-IRA includes an estimation model that uses a stacked Gated Recurrent Unit (GRU) structure and an analysis method that detects anomalies based on the difference between the estimated value and the actual measurement (residual). SG-IRA's residual analysis method dynamically adapts the detection threshold from the population using frequency analysis, unlike the baseline model that relies on a constant threshold. In this paper, SG-IRA is evaluated using the industrial control systems (ICS) datasets. SG-IRA improves the detection performance (F1 score) by 5.9% compared to the baseline model.

Sentiment Analysis Using Deep Learning Model based on Phoneme-level Korean (한글 음소 단위 딥러닝 모형을 이용한 감성분석)

  • Lee, Jae Jun;Kwon, Suhn Beom;Ahn, Sung Mahn
    • Journal of Information Technology Services
    • /
    • v.17 no.1
    • /
    • pp.79-89
    • /
    • 2018
  • Sentiment analysis is a technique of text mining that extracts feelings of the person who wrote the sentence like movie review. The preliminary researches of sentiment analysis identify sentiments by using the dictionary which contains negative and positive words collected in advance. As researches on deep learning are actively carried out, sentiment analysis using deep learning model with morpheme or word unit has been done. However, this model has disadvantages in that the word dictionary varies according to the domain and the number of morphemes or words gets relatively larger than that of phonemes. Therefore, the size of the dictionary becomes large and the complexity of the model increases accordingly. We construct a sentiment analysis model using recurrent neural network by dividing input data into phoneme-level which is smaller than morpheme-level. To verify the performance, we use 30,000 movie reviews from the Korean biggest portal, Naver. Morpheme-level sentiment analysis model is also implemented and compared. As a result, the phoneme-level sentiment analysis model is superior to that of the morpheme-level, and in particular, the phoneme-level model using LSTM performs better than that of using GRU model. It is expected that Korean text processing based on a phoneme-level model can be applied to various text mining and language models.

A Comparative Study of Machine Learning Algorithms Based on Tensorflow for Data Prediction (데이터 예측을 위한 텐서플로우 기반 기계학습 알고리즘 비교 연구)

  • Abbas, Qalab E.;Jang, Sung-Bong
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.10 no.3
    • /
    • pp.71-80
    • /
    • 2021
  • The selection of an appropriate neural network algorithm is an important step for accurate data prediction in machine learning. Many algorithms based on basic artificial neural networks have been devised to efficiently predict future data. These networks include deep neural networks (DNNs), recurrent neural networks (RNNs), long short-term memory (LSTM) networks, and gated recurrent unit (GRU) neural networks. Developers face difficulties when choosing among these networks because sufficient information on their performance is unavailable. To alleviate this difficulty, we evaluated the performance of each algorithm by comparing their errors and processing times. Each neural network model was trained using a tax dataset, and the trained model was used for data prediction to compare accuracies among the various algorithms. Furthermore, the effects of activation functions and various optimizers on the performance of the models were analyzed The experimental results show that the GRU and LSTM algorithms yields the lowest prediction error with an average RMSE of 0.12 and an average R2 score of 0.78 and 0.75 respectively, and the basic DNN model achieves the lowest processing time but highest average RMSE of 0.163. Furthermore, the Adam optimizer yields the best performance (with DNN, GRU, and LSTM) in terms of error and the worst performance in terms of processing time. The findings of this study are thus expected to be useful for scientists and developers.

The Prediction of Durability Performance for Chloride Ingress in Fly Ash Concrete by Artificial Neural Network Algorithm (인공 신경망 알고리즘을 활용한 플라이애시 콘크리트의 염해 내구성능 예측)

  • Kwon, Seung-Jun;Yoon, Yong-Sik
    • Journal of the Korea institute for structural maintenance and inspection
    • /
    • v.26 no.5
    • /
    • pp.127-134
    • /
    • 2022
  • In this study, RCPTs (Rapid Chloride Penetration Test) were performed for fly ash concrete with curing age of 4 ~ 6 years. The concrete mixtures were prepared with 3 levels of water to binder ratio (0.37, 0.42, and 0.47) and 2 levels of substitution ratio of fly ash (0 and 30%), and the improved passed charges of chloride ion behavior were quantitatively analyzed. Additionally, the results were trained through the univariate time series models consisted of GRU (Gated Recurrent Unit) algorithm and those from the models were evaluated. As the result of the RCPT, fly ash concrete showed the reduced passed charges with period and an more improved resistance to chloride penetration than OPC concrete. At the final evaluation period (6 years), fly ash concrete showed 'Very low' grade in all W/B (water to binder) ratio, however OPC concrete showed 'Moderate' grade in the condition with the highest W/B ratio (0.47). The adopted algorithm of GRU for this study can analyze time series data and has the advantage like operation efficiency. The deep learning model with 4 hidden layers was designed, and it provided a reasonable prediction results of passed charge. The deep learning model from this study has a limitation of single consideration of a univariate time series characteristic, but it is in the developing process of providing various characteristics of concrete like strength and diffusion coefficient through additional studies.

Using machine learning to forecast and assess the uncertainty in the response of a typical PWR undergoing a steam generator tube rupture accident

  • Tran Canh Hai Nguyen ;Aya Diab
    • Nuclear Engineering and Technology
    • /
    • v.55 no.9
    • /
    • pp.3423-3440
    • /
    • 2023
  • In this work, a multivariate time-series machine learning meta-model is developed to predict the transient response of a typical nuclear power plant (NPP) undergoing a steam generator tube rupture (SGTR). The model employs Recurrent Neural Networks (RNNs), including the Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), and a hybrid CNN-LSTM model. To address the uncertainty inherent in such predictions, a Bayesian Neural Network (BNN) was implemented. The models were trained using a database generated by the Best Estimate Plus Uncertainty (BEPU) methodology; coupling the thermal hydraulics code, RELAP5/SCDAP/MOD3.4 to the statistical tool, DAKOTA, to predict the variation in system response under various operational and phenomenological uncertainties. The RNN models successfully captures the underlying characteristics of the data with reasonable accuracy, and the BNN-LSTM approach offers an additional layer of insight into the level of uncertainty associated with the predictions. The results demonstrate that LSTM outperforms GRU, while the hybrid CNN-LSTM model is computationally the most efficient. This study aims to gain a better understanding of the capabilities and limitations of machine learning models in the context of nuclear safety. By expanding the application of ML models to more severe accident scenarios, where operators are under extreme stress and prone to errors, ML models can provide valuable support and act as expert systems to assist in decision-making while minimizing the chances of human error.

Recurrent Neural Network Model for Predicting Tight Oil Productivity Using Type Curve Parameters for Each Cluster (군집 별 표준곡선 매개변수를 이용한 치밀오일 생산성 예측 순환신경망 모델)

  • Han, Dong-kwon;Kim, Min-soo;Kwon, Sun-il
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2021.10a
    • /
    • pp.297-299
    • /
    • 2021
  • Predicting future productivity of tight oil is an important task for analyzing residual oil recovery and reservoir behavior. In general, productivity prediction is made using the decline curve analysis(DCA). In this study, we intend to propose an effective model for predicting future production using deep learning-based recurrent neural networks(RNN), LSTM, and GRU algorithms. As input variables, the main parameters are oil, gas, water, which are calculated during the production of tight oil, and the type curve calculated through various cluster analyzes. the output variable is the monthly oil production. Existing empirical models, the DCA and RNN models, were compared, and an optimal model was derived through hyperparameter tuning to improve the predictive performance of the model.

  • PDF

Comparison of System Call Sequence Embedding Approaches for Anomaly Detection (이상 탐지를 위한 시스템콜 시퀀스 임베딩 접근 방식 비교)

  • Lee, Keun-Seop;Park, Kyungseon;Kim, Kangseok
    • Journal of Convergence for Information Technology
    • /
    • v.12 no.2
    • /
    • pp.47-53
    • /
    • 2022
  • Recently, with the change of the intelligent security paradigm, study to apply various information generated from various information security systems to AI-based anomaly detection is increasing. Therefore, in this study, in order to convert log-like time series data into a vector, which is a numerical feature, the CBOW and Skip-gram inference methods of deep learning-based Word2Vec model and statistical method based on the coincidence frequency were used to transform the published ADFA system call data. In relation to this, an experiment was carried out through conversion into various embedding vectors considering the dimension of vector, the length of sequence, and the window size. In addition, the performance of the embedding methods used as well as the detection performance were compared and evaluated through GRU-based anomaly detection model using vectors generated by the embedding model as an input. Compared to the statistical model, it was confirmed that the Skip-gram maintains more stable performance without biasing a specific window size or sequence length, and is more effective in making each event of sequence data into an embedding vector.

A Study on Recognition of Citation Metadata using Bidirectional GRU-CRF Model based on Pre-trained Language Model (사전학습 된 언어 모델 기반의 양방향 게이트 순환 유닛 모델과 조건부 랜덤 필드 모델을 이용한 참고문헌 메타데이터 인식 연구)

  • Ji, Seon-yeong;Choi, Sung-pil
    • Journal of the Korean Society for information Management
    • /
    • v.38 no.1
    • /
    • pp.221-242
    • /
    • 2021
  • This study applied reference metadata recognition using bidirectional GRU-CRF model based on pre-trained language model. The experimental group consists of 161,315 references extracted by 53,562 academic documents in PDF format collected from 40 journals published in 2018 based on rules. In order to construct an experiment set. This study was conducted to automatically extract the references from academic literature in PDF format. Through this study, the language model with the highest performance was identified, and additional experiments were conducted on the model to compare the recognition performance according to the size of the training set. Finally, the performance of each metadata was confirmed.

Sentiment Analysis to Evaluate Different Deep Learning Approaches

  • Sheikh Muhammad Saqib ;Tariq Naeem
    • International Journal of Computer Science & Network Security
    • /
    • v.23 no.11
    • /
    • pp.83-92
    • /
    • 2023
  • The majority of product users rely on the reviews that are posted on the appropriate website. Both users and the product's manufacturer could benefit from these reviews. Daily, thousands of reviews are submitted; how is it possible to read them all? Sentiment analysis has become a critical field of research as posting reviews become more and more common. Machine learning techniques that are supervised, unsupervised, and semi-supervised have worked very hard to harvest this data. The complicated and technological area of feature engineering falls within machine learning. Using deep learning, this tedious process may be completed automatically. Numerous studies have been conducted on deep learning models like LSTM, CNN, RNN, and GRU. Each model has employed a certain type of data, such as CNN for pictures and LSTM for language translation, etc. According to experimental results utilizing a publicly accessible dataset with reviews for all of the models, both positive and negative, and CNN, the best model for the dataset was identified in comparison to the other models, with an accuracy rate of 81%.