• Title/Summary/Keyword: Neural Network Language Model

Search Result 168, Processing Time 0.023 seconds

A Study on Word Sense Disambiguation Using Bidirectional Recurrent Neural Network for Korean Language

  • Min, Jihong;Jeon, Joon-Woo;Song, Kwang-Ho;Kim, Yoo-Sung
    • Journal of the Korea Society of Computer and Information
    • /
    • v.22 no.4
    • /
    • pp.41-49
    • /
    • 2017
  • Word sense disambiguation(WSD) that determines the exact meaning of homonym which can be used in different meanings even in one form is very important to understand the semantical meaning of text document. Many recent researches on WSD have widely used NNLM(Neural Network Language Model) in which neural network is used to represent a document into vectors and to analyze its semantics. Among the previous WSD researches using NNLM, RNN(Recurrent Neural Network) model has better performance than other models because RNN model can reflect the occurrence order of words in addition to the word appearance information in a document. However, since RNN model uses only the forward order of word occurrences in a document, it is not able to reflect natural language's characteristics that later words can affect the meanings of the preceding words. In this paper, we propose a WSD scheme using Bidirectional RNN that can reflect not only the forward order but also the backward order of word occurrences in a document. From the experiments, the accuracy of the proposed model is higher than that of previous method using RNN. Hence, it is confirmed that bidirectional order information of word occurrences is useful for WSD in Korean language.

Input Dimension Reduction based on Continuous Word Vector for Deep Neural Network Language Model (Deep Neural Network 언어모델을 위한 Continuous Word Vector 기반의 입력 차원 감소)

  • Kim, Kwang-Ho;Lee, Donghyun;Lim, Minkyu;Kim, Ji-Hwan
    • Phonetics and Speech Sciences
    • /
    • v.7 no.4
    • /
    • pp.3-8
    • /
    • 2015
  • In this paper, we investigate an input dimension reduction method using continuous word vector in deep neural network language model. In the proposed method, continuous word vectors were generated by using Google's Word2Vec from a large training corpus to satisfy distributional hypothesis. 1-of-${\left|V\right|}$ coding discrete word vectors were replaced with their corresponding continuous word vectors. In our implementation, the input dimension was successfully reduced from 20,000 to 600 when a tri-gram language model is used with a vocabulary of 20,000 words. The total amount of time in training was reduced from 30 days to 14 days for Wall Street Journal training corpus (corpus length: 37M words).

Simple and effective neural coreference resolution for Korean language

  • Park, Cheoneum;Lim, Joonho;Ryu, Jihee;Kim, Hyunki;Lee, Changki
    • ETRI Journal
    • /
    • v.43 no.6
    • /
    • pp.1038-1048
    • /
    • 2021
  • We propose an end-to-end neural coreference resolution for the Korean language that uses an attention mechanism to point to the same entity. Because Korean is a head-final language, we focused on a method that uses a pointer network based on the head. The key idea is to consider all nouns in the document as candidates based on the head-final characteristics of the Korean language and learn distributions over the referenced entity positions for each noun. Given the recent success of applications using bidirectional encoder representation from transformer (BERT) in natural language-processing tasks, we employed BERT in the proposed model to create word representations based on contextual information. The experimental results indicated that the proposed model achieved state-of-the-art performance in Korean language coreference resolution.

Automatic Evaluation of Elementary School English Writing Based on Recurrent Neural Network Language Model (순환 신경망 기반 언어 모델을 활용한 초등 영어 글쓰기 자동 평가)

  • Park, Youngki
    • Journal of The Korean Association of Information Education
    • /
    • v.21 no.2
    • /
    • pp.161-169
    • /
    • 2017
  • We often use spellcheckers in order to correct the syntactic errors in our documents. However, these computer programs are not enough for elementary school students, because their sentences are not smooth even after correcting the syntactic errors in many cases. In this paper, we introduce an automated method for evaluating the smoothness of two synonymous sentences. This method uses a recurrent neural network to solve the problem of long-term dependencies and exploits subwords to cope with the rare word problem. We trained the recurrent neural network language model based on a monolingual corpus of about two million English sentences. In our experiments, the trained model successfully selected the more smooth sentences for all of nine types of test set. We expect that our approach will help in elementary school writing after being implemented as an application for smart devices.

Simultaneous neural machine translation with a reinforced attention mechanism

  • Lee, YoHan;Shin, JongHun;Kim, YoungKil
    • ETRI Journal
    • /
    • v.43 no.5
    • /
    • pp.775-786
    • /
    • 2021
  • To translate in real time, a simultaneous translation system should determine when to stop reading source tokens and generate target tokens corresponding to a partial source sentence read up to that point. However, conventional attention-based neural machine translation (NMT) models cannot produce translations with adequate latency in online scenarios because they wait until a source sentence is completed to compute alignment between the source and target tokens. To address this issue, we propose a reinforced learning (RL)-based attention mechanism, the reinforced attention mechanism, which allows a neural translation model to jointly train the stopping criterion and a partial translation model. The proposed attention mechanism comprises two modules, one to ensure translation quality and the other to address latency. Different from previous RL-based simultaneous translation systems, which learn the stopping criterion from a fixed NMT model, the modules can be trained jointly with a novel reward function. In our experiments, the proposed model has better translation quality and comparable latency compared to previous models.

Improving Wind Speed Forecasts Using Deep Neural Network

  • Hong, Seokmin;Ku, SungKwan
    • International Journal of Advanced Culture Technology
    • /
    • v.7 no.4
    • /
    • pp.327-333
    • /
    • 2019
  • Wind speed data constitute important weather information for aircrafts flying at low altitudes, such as drones. Currently, the accuracy of low altitude wind predictions is much lower than that of high-altitude wind predictions. Deep neural networks are proposed in this study as a method to improve wind speed forecast information. Deep neural networks mimic the learning process of the interactions among neurons in the brain, and it is used in various fields, such as recognition of image, sound, and texts, image and natural language processing, and pattern recognition in time-series. In this study, the deep neural network model is constructed using the wind prediction values generated by the numerical model as an input to improve the wind speed forecasts. Using the ground wind speed forecast data collected at the Boseong Meteorological Observation Tower, wind speed forecast values obtained by the numerical model are compared with those obtained by the model proposed in this study for the verification of the validity and compatibility of the proposed model.

Image Caption Generation using Recurrent Neural Network (Recurrent Neural Network를 이용한 이미지 캡션 생성)

  • Lee, Changki
    • Journal of KIISE
    • /
    • v.43 no.8
    • /
    • pp.878-882
    • /
    • 2016
  • Automatic generation of captions for an image is a very difficult task, due to the necessity of computer vision and natural language processing technologies. However, this task has many important applications, such as early childhood education, image retrieval, and navigation for blind. In this paper, we describe a Recurrent Neural Network (RNN) model for generating image captions, which takes image features extracted from a Convolutional Neural Network (CNN). We demonstrate that our models produce state of the art results in image caption generation experiments on the Flickr 8K, Flickr 30K, and MS COCO datasets.

Text Classification on Social Network Platforms Based on Deep Learning Models

  • YA, Chen;Tan, Juan;Hoekyung, Jung
    • Journal of information and communication convergence engineering
    • /
    • v.21 no.1
    • /
    • pp.9-16
    • /
    • 2023
  • The natural language on social network platforms has a certain front-to-back dependency in structure, and the direct conversion of Chinese text into a vector makes the dimensionality very high, thereby resulting in the low accuracy of existing text classification methods. To this end, this study establishes a deep learning model that combines a big data ultra-deep convolutional neural network (UDCNN) and long short-term memory network (LSTM). The deep structure of UDCNN is used to extract the features of text vector classification. The LSTM stores historical information to extract the context dependency of long texts, and word embedding is introduced to convert the text into low-dimensional vectors. Experiments are conducted on the social network platforms Sogou corpus and the University HowNet Chinese corpus. The research results show that compared with CNN + rand, LSTM, and other models, the neural network deep learning hybrid model can effectively improve the accuracy of text classification.

FAGON: Fake News Detection Model Using Grammatical Transformation on Deep Neural Network

  • Seo, Youngkyung;Han, Seong-Soo;Jeon, You-Boo;Jeong, Chang-Sung
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.10
    • /
    • pp.4958-4970
    • /
    • 2019
  • As technology advances, the amount of fake news is increasing more and more by various reasons such as political issues and advertisement exaggeration. However, there have been very few research works on fake news detection, especially which uses grammatical transformation on deep neural network. In this paper, we shall present a new Fake News Detection Model, called FAGON(Fake news detection model using Grammatical transformation On deep Neural network) which determines efficiently if the proposition is true or not for the given article by learning grammatical transformation on neural network. Especially, our model focuses the Korean language. It consists of two modules: sentence generator and classification. The former generates multiple sentences which have the same meaning as the proposition, but with different grammar by training the grammatical transformation. The latter classifies the proposition as true or false by training with vectors generated from each sentence of the article and the multiple sentences obtained from the former model respectively. We shall show that our model is designed to detect fake news effectively by exploiting various grammatical transformation and proper classification structure.

A Unicode based Deep Handwritten Character Recognition model for Telugu to English Language Translation

  • BV Subba Rao;J. Nageswara Rao;Bandi Vamsi;Venkata Nagaraju Thatha;Katta Subba Rao
    • International Journal of Computer Science & Network Security
    • /
    • v.24 no.2
    • /
    • pp.101-112
    • /
    • 2024
  • Telugu language is considered as fourth most used language in India especially in the regions of Andhra Pradesh, Telangana, Karnataka etc. In international recognized countries also, Telugu is widely growing spoken language. This language comprises of different dependent and independent vowels, consonants and digits. In this aspect, the enhancement of Telugu Handwritten Character Recognition (HCR) has not been propagated. HCR is a neural network technique of converting a documented image to edited text one which can be used for many other applications. This reduces time and effort without starting over from the beginning every time. In this work, a Unicode based Handwritten Character Recognition(U-HCR) is developed for translating the handwritten Telugu characters into English language. With the use of Centre of Gravity (CG) in our model we can easily divide a compound character into individual character with the help of Unicode values. For training this model, we have used both online and offline Telugu character datasets. To extract the features in the scanned image we used convolutional neural network along with Machine Learning classifiers like Random Forest and Support Vector Machine. Stochastic Gradient Descent (SGD), Root Mean Square Propagation (RMS-P) and Adaptative Moment Estimation (ADAM)optimizers are used in this work to enhance the performance of U-HCR and to reduce the loss function value. This loss value reduction can be possible with optimizers by using CNN. In both online and offline datasets, proposed model showed promising results by maintaining the accuracies with 90.28% for SGD, 96.97% for RMS-P and 93.57% for ADAM respectively.