• Title/Summary/Keyword: Deep Learning based System

Search Result 1,206, Processing Time 0.025 seconds

Korean Sentence Generation Using Phoneme-Level LSTM Language Model (한국어 음소 단위 LSTM 언어모델을 이용한 문장 생성)

  • Ahn, SungMahn;Chung, Yeojin;Lee, Jaejoon;Yang, Jiheon
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.2
    • /
    • pp.71-88
    • /
    • 2017
  • Language models were originally developed for speech recognition and language processing. Using a set of example sentences, a language model predicts the next word or character based on sequential input data. N-gram models have been widely used but this model cannot model the correlation between the input units efficiently since it is a probabilistic model which are based on the frequency of each unit in the training set. Recently, as the deep learning algorithm has been developed, a recurrent neural network (RNN) model and a long short-term memory (LSTM) model have been widely used for the neural language model (Ahn, 2016; Kim et al., 2016; Lee et al., 2016). These models can reflect dependency between the objects that are entered sequentially into the model (Gers and Schmidhuber, 2001; Mikolov et al., 2010; Sundermeyer et al., 2012). In order to learning the neural language model, texts need to be decomposed into words or morphemes. Since, however, a training set of sentences includes a huge number of words or morphemes in general, the size of dictionary is very large and so it increases model complexity. In addition, word-level or morpheme-level models are able to generate vocabularies only which are contained in the training set. Furthermore, with highly morphological languages such as Turkish, Hungarian, Russian, Finnish or Korean, morpheme analyzers have more chance to cause errors in decomposition process (Lankinen et al., 2016). Therefore, this paper proposes a phoneme-level language model for Korean language based on LSTM models. A phoneme such as a vowel or a consonant is the smallest unit that comprises Korean texts. We construct the language model using three or four LSTM layers. Each model was trained using Stochastic Gradient Algorithm and more advanced optimization algorithms such as Adagrad, RMSprop, Adadelta, Adam, Adamax, and Nadam. Simulation study was done with Old Testament texts using a deep learning package Keras based the Theano. After pre-processing the texts, the dataset included 74 of unique characters including vowels, consonants, and punctuation marks. Then we constructed an input vector with 20 consecutive characters and an output with a following 21st character. Finally, total 1,023,411 sets of input-output vectors were included in the dataset and we divided them into training, validation, testsets with proportion 70:15:15. All the simulation were conducted on a system equipped with an Intel Xeon CPU (16 cores) and a NVIDIA GeForce GTX 1080 GPU. We compared the loss function evaluated for the validation set, the perplexity evaluated for the test set, and the time to be taken for training each model. As a result, all the optimization algorithms but the stochastic gradient algorithm showed similar validation loss and perplexity, which are clearly superior to those of the stochastic gradient algorithm. The stochastic gradient algorithm took the longest time to be trained for both 3- and 4-LSTM models. On average, the 4-LSTM layer model took 69% longer training time than the 3-LSTM layer model. However, the validation loss and perplexity were not improved significantly or became even worse for specific conditions. On the other hand, when comparing the automatically generated sentences, the 4-LSTM layer model tended to generate the sentences which are closer to the natural language than the 3-LSTM model. Although there were slight differences in the completeness of the generated sentences between the models, the sentence generation performance was quite satisfactory in any simulation conditions: they generated only legitimate Korean letters and the use of postposition and the conjugation of verbs were almost perfect in the sense of grammar. The results of this study are expected to be widely used for the processing of Korean language in the field of language processing and speech recognition, which are the basis of artificial intelligence systems.

Development of Mask-RCNN Model for Detecting Greenhouses Based on Satellite Image (위성이미지 기반 시설하우스 판별 Mask-RCNN 모델 개발)

  • Kim, Yun Seok;Heo, Seong;Yoon, Seong Uk;Ahn, Jinhyun;Choi, Inchan;Chang, Sungyul;Lee, Seung-Jae;Chung, Yong Suk
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.23 no.3
    • /
    • pp.156-162
    • /
    • 2021
  • The number of smart farms has increased to save labor in agricultural production as the subsidy become available from central and local governments. The number of illegal greenhouses has also increased, which causes serious issues for the local governments. In the present study, we developed Mask-RCNN model to detect greenhouses based on satellite images. Greenhouses in the satellite images were labeled for training and validation of the model. The Mask-RC NN model had the average precision (AP) of 75.6%. The average precision values for 50% and 75% of overlapping area were 91.1% and 81.8%, respectively. This results indicated that the Mask-RC NN model would be useful to detect the greenhouses recently built without proper permission using a periodical screening procedure based on satellite images. Furthermore, the model can be connected with GIS to establish unified management system for greenhouses. It can also be applied to the statistical analysis of the number and total area of greenhouses.

Design and Implementation of BNN based Human Identification and Motion Classification System Using CW Radar (연속파 레이다를 활용한 이진 신경망 기반 사람 식별 및 동작 분류 시스템 설계 및 구현)

  • Kim, Kyeong-min;Kim, Seong-jin;NamKoong, Ho-jung;Jung, Yun-ho
    • Journal of Advanced Navigation Technology
    • /
    • v.26 no.4
    • /
    • pp.211-218
    • /
    • 2022
  • Continuous wave (CW) radar has the advantage of reliability and accuracy compared to other sensors such as camera and lidar. In addition, binarized neural network (BNN) has a characteristic that dramatically reduces memory usage and complexity compared to other deep learning networks. Therefore, this paper proposes binarized neural network based human identification and motion classification system using CW radar. After receiving a signal from CW radar, a spectrogram is generated through a short-time Fourier transform (STFT). Based on this spectrogram, we propose an algorithm that detects whether a person approaches a radar. Also, we designed an optimized BNN model that can support the accuracy of 90.0% for human identification and 98.3% for motion classification. In order to accelerate BNN operation, we designed BNN hardware accelerator on field programmable gate array (FPGA). The accelerator was implemented with 1,030 logics, 836 registers, and 334.904 Kbit block memory, and it was confirmed that the real-time operation was possible with a total calculation time of 6 ms from inference to transferring result.

Fake News Detection on YouTube Using Related Video Information (관련 동영상 정보를 활용한 YouTube 가짜뉴스 탐지 기법)

  • Junho Kim;Yongjun Shin;Hyunchul Ahn
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.3
    • /
    • pp.19-36
    • /
    • 2023
  • As advances in information and communication technology have made it easier for anyone to produce and disseminate information, a new problem has emerged: fake news, which is false information intentionally shared to mislead people. Initially spread mainly through text, fake news has gradually evolved and is now distributed in multimedia formats. Since its founding in 2005, YouTube has become the world's leading video platform and is used by most people worldwide. However, it has also become a primary source of fake news, causing social problems. Various researchers have been working on detecting fake news on YouTube. There are content-based and background information-based approaches to fake news detection. Still, content-based approaches are dominant when looking at conventional fake news research and YouTube fake news detection research. This study proposes a fake news detection method based on background information rather than content-based fake news detection. In detail, we suggest detecting fake news by utilizing related video information from YouTube. Specifically, the method detects fake news through CNN, a deep learning network, from the vectorized information obtained from related videos and the original video using Doc2vec, an embedding technique. The empirical analysis shows that the proposed method has better prediction performance than the existing content-based approach to detecting fake news on YouTube. The proposed method in this study contributes to making our society safer and more reliable by preventing the spread of fake news on YouTube, which is highly contagious.

Predictive Clustering-based Collaborative Filtering Technique for Performance-Stability of Recommendation System (추천 시스템의 성능 안정성을 위한 예측적 군집화 기반 협업 필터링 기법)

  • Lee, O-Joun;You, Eun-Soon
    • Journal of Intelligence and Information Systems
    • /
    • v.21 no.1
    • /
    • pp.119-142
    • /
    • 2015
  • With the explosive growth in the volume of information, Internet users are experiencing considerable difficulties in obtaining necessary information online. Against this backdrop, ever-greater importance is being placed on a recommender system that provides information catered to user preferences and tastes in an attempt to address issues associated with information overload. To this end, a number of techniques have been proposed, including content-based filtering (CBF), demographic filtering (DF) and collaborative filtering (CF). Among them, CBF and DF require external information and thus cannot be applied to a variety of domains. CF, on the other hand, is widely used since it is relatively free from the domain constraint. The CF technique is broadly classified into memory-based CF, model-based CF and hybrid CF. Model-based CF addresses the drawbacks of CF by considering the Bayesian model, clustering model or dependency network model. This filtering technique not only improves the sparsity and scalability issues but also boosts predictive performance. However, it involves expensive model-building and results in a tradeoff between performance and scalability. Such tradeoff is attributed to reduced coverage, which is a type of sparsity issues. In addition, expensive model-building may lead to performance instability since changes in the domain environment cannot be immediately incorporated into the model due to high costs involved. Cumulative changes in the domain environment that have failed to be reflected eventually undermine system performance. This study incorporates the Markov model of transition probabilities and the concept of fuzzy clustering with CBCF to propose predictive clustering-based CF (PCCF) that solves the issues of reduced coverage and of unstable performance. The method improves performance instability by tracking the changes in user preferences and bridging the gap between the static model and dynamic users. Furthermore, the issue of reduced coverage also improves by expanding the coverage based on transition probabilities and clustering probabilities. The proposed method consists of four processes. First, user preferences are normalized in preference clustering. Second, changes in user preferences are detected from review score entries during preference transition detection. Third, user propensities are normalized using patterns of changes (propensities) in user preferences in propensity clustering. Lastly, the preference prediction model is developed to predict user preferences for items during preference prediction. The proposed method has been validated by testing the robustness of performance instability and scalability-performance tradeoff. The initial test compared and analyzed the performance of individual recommender systems each enabled by IBCF, CBCF, ICFEC and PCCF under an environment where data sparsity had been minimized. The following test adjusted the optimal number of clusters in CBCF, ICFEC and PCCF for a comparative analysis of subsequent changes in the system performance. The test results revealed that the suggested method produced insignificant improvement in performance in comparison with the existing techniques. In addition, it failed to achieve significant improvement in the standard deviation that indicates the degree of data fluctuation. Notwithstanding, it resulted in marked improvement over the existing techniques in terms of range that indicates the level of performance fluctuation. The level of performance fluctuation before and after the model generation improved by 51.31% in the initial test. Then in the following test, there has been 36.05% improvement in the level of performance fluctuation driven by the changes in the number of clusters. This signifies that the proposed method, despite the slight performance improvement, clearly offers better performance stability compared to the existing techniques. Further research on this study will be directed toward enhancing the recommendation performance that failed to demonstrate significant improvement over the existing techniques. The future research will consider the introduction of a high-dimensional parameter-free clustering algorithm or deep learning-based model in order to improve performance in recommendations.

Content-based Korean journal recommendation system using Sentence BERT (Sentence BERT를 이용한 내용 기반 국문 저널추천 시스템)

  • Yongwoo Kim;Daeyoung Kim;Hyunhee Seo;Young-Min Kim
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.3
    • /
    • pp.37-55
    • /
    • 2023
  • With the development of electronic journals and the emergence of various interdisciplinary studies, the selection of journals for publication has become a new challenge for researchers. Even if a paper is of high quality, it may face rejection due to a mismatch between the paper's topic and the scope of the journal. While research on assisting researchers in journal selection has been actively conducted in English, the same cannot be said for Korean journals. In this study, we propose a system that recommends Korean journals for submission. Firstly, we utilize SBERT (Sentence BERT) to embed abstracts of previously published papers at the document level, compare the similarity between new documents and published papers, and recommend journals accordingly. Next, the order of recommended journals is determined by considering the similarity of abstracts, keywords, and title. Subsequently, journals that are similar to the top recommended journal from previous stage are added by using a dictionary of words constructed for each journal, thereby enhancing recommendation diversity. The recommendation system, built using this approach, achieved a Top-10 accuracy level of 76.6%, and the validity of the recommendation results was confirmed through user feedback. Furthermore, it was found that each step of the proposed framework contributes to improving recommendation accuracy. This study provides a new approach to recommending academic journals in the Korean language, which has not been actively studied before, and it has also practical implications as the proposed framework can be easily applied to services.

Performance Evaluation and Analysis on Single and Multi-Network Virtualization Systems with Virtio and SR-IOV (가상화 시스템에서 Virtio와 SR-IOV 적용에 대한 단일 및 다중 네트워크 성능 평가 및 분석)

  • Jaehak Lee;Jongbeom Lim;Heonchang Yu
    • The Transactions of the Korea Information Processing Society
    • /
    • v.13 no.2
    • /
    • pp.48-59
    • /
    • 2024
  • As functions that support virtualization on their own in hardware are developed, user applications having various workloads are operating efficiently in the virtualization system. SR-IOV is a virtualization support function that takes direct access to PCI devices, thus giving a high I/O performance by minimizing the need for hypervisor or operating system interventions. With SR-IOV, network I/O acceleration can be realized in virtualization systems that have relatively long I/O paths compared to bare-metal systems and frequent context switches between the user area and kernel area. To take performance advantages of SR-IOV, network resource management policies that can derive optimal network performance when SR-IOV is applied to an instance such as a virtual machine(VM) or container are being actively studied.This paper evaluates and analyzes the network performance of SR-IOV implementing I/O acceleration is compared with Virtio in terms of 1) network delay, 2) network throughput, 3) network fairness, 4) performance interference, and 5) multi-network. The contributions of this paper are as follows. First, the network I/O process of Virtio and SR-IOV was clearly explained in the virtualization system, and second, the evaluation results of the network performance of Virtio and SR-IOV were analyzed based on various performance metrics. Third, the system overhead and the possibility of optimization for the SR-IOV network in a virtualization system with high VM density were experimentally confirmed. The experimental results and analysis of the paper are expected to be referenced in the network resource management policy for virtualization systems that operate network-intensive services such as smart factories, connected cars, deep learning inference models, and crowdsourcing.

Automatic gasometer reading system using selective optical character recognition (관심 문자열 인식 기술을 이용한 가스계량기 자동 검침 시스템)

  • Lee, Kyohyuk;Kim, Taeyeon;Kim, Wooju
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.1-25
    • /
    • 2020
  • In this paper, we suggest an application system architecture which provides accurate, fast and efficient automatic gasometer reading function. The system captures gasometer image using mobile device camera, transmits the image to a cloud server on top of private LTE network, and analyzes the image to extract character information of device ID and gas usage amount by selective optical character recognition based on deep learning technology. In general, there are many types of character in an image and optical character recognition technology extracts all character information in an image. But some applications need to ignore non-of-interest types of character and only have to focus on some specific types of characters. For an example of the application, automatic gasometer reading system only need to extract device ID and gas usage amount character information from gasometer images to send bill to users. Non-of-interest character strings, such as device type, manufacturer, manufacturing date, specification and etc., are not valuable information to the application. Thus, the application have to analyze point of interest region and specific types of characters to extract valuable information only. We adopted CNN (Convolutional Neural Network) based object detection and CRNN (Convolutional Recurrent Neural Network) technology for selective optical character recognition which only analyze point of interest region for selective character information extraction. We build up 3 neural networks for the application system. The first is a convolutional neural network which detects point of interest region of gas usage amount and device ID information character strings, the second is another convolutional neural network which transforms spatial information of point of interest region to spatial sequential feature vectors, and the third is bi-directional long short term memory network which converts spatial sequential information to character strings using time-series analysis mapping from feature vectors to character strings. In this research, point of interest character strings are device ID and gas usage amount. Device ID consists of 12 arabic character strings and gas usage amount consists of 4 ~ 5 arabic character strings. All system components are implemented in Amazon Web Service Cloud with Intel Zeon E5-2686 v4 CPU and NVidia TESLA V100 GPU. The system architecture adopts master-lave processing structure for efficient and fast parallel processing coping with about 700,000 requests per day. Mobile device captures gasometer image and transmits to master process in AWS cloud. Master process runs on Intel Zeon CPU and pushes reading request from mobile device to an input queue with FIFO (First In First Out) structure. Slave process consists of 3 types of deep neural networks which conduct character recognition process and runs on NVidia GPU module. Slave process is always polling the input queue to get recognition request. If there are some requests from master process in the input queue, slave process converts the image in the input queue to device ID character string, gas usage amount character string and position information of the strings, returns the information to output queue, and switch to idle mode to poll the input queue. Master process gets final information form the output queue and delivers the information to the mobile device. We used total 27,120 gasometer images for training, validation and testing of 3 types of deep neural network. 22,985 images were used for training and validation, 4,135 images were used for testing. We randomly splitted 22,985 images with 8:2 ratio for training and validation respectively for each training epoch. 4,135 test image were categorized into 5 types (Normal, noise, reflex, scale and slant). Normal data is clean image data, noise means image with noise signal, relfex means image with light reflection in gasometer region, scale means images with small object size due to long-distance capturing and slant means images which is not horizontally flat. Final character string recognition accuracies for device ID and gas usage amount of normal data are 0.960 and 0.864 respectively.

Dual CNN Structured Sound Event Detection Algorithm Based on Real Life Acoustic Dataset (실생활 음향 데이터 기반 이중 CNN 구조를 특징으로 하는 음향 이벤트 인식 알고리즘)

  • Suh, Sangwon;Lim, Wootaek;Jeong, Youngho;Lee, Taejin;Kim, Hui Yong
    • Journal of Broadcast Engineering
    • /
    • v.23 no.6
    • /
    • pp.855-865
    • /
    • 2018
  • Sound event detection is one of the research areas to model human auditory cognitive characteristics by recognizing events in an environment with multiple acoustic events and determining the onset and offset time for each event. DCASE, a research group on acoustic scene classification and sound event detection, is proceeding challenges to encourage participation of researchers and to activate sound event detection research. However, the size of the dataset provided by the DCASE Challenge is relatively small compared to ImageNet, which is a representative dataset for visual object recognition, and there are not many open sources for the acoustic dataset. In this study, the sound events that can occur in indoor and outdoor are collected on a larger scale and annotated for dataset construction. Furthermore, to improve the performance of the sound event detection task, we developed a dual CNN structured sound event detection system by adding a supplementary neural network to a convolutional neural network to determine the presence of sound events. Finally, we conducted a comparative experiment with both baseline systems of the DCASE 2016 and 2017.

Prediction of Dormant Customer in the Card Industry (카드산업에서 휴면 고객 예측)

  • DongKyu Lee;Minsoo Shin
    • Journal of Service Research and Studies
    • /
    • v.13 no.2
    • /
    • pp.99-113
    • /
    • 2023
  • In a customer-based industry, customer retention is the competitiveness of a company, and improving customer retention improves the competitiveness of the company. Therefore, accurate prediction and management of potential dormant customers is paramount to increasing the competitiveness of the enterprise. In particular, there are numerous competitors in the domestic card industry, and the government is introducing an automatic closing system for dormant card management. As a result of these social changes, the card industry must focus on better predicting and managing potential dormant cards, and better predicting dormant customers is emerging as an important challenge. In this study, the Recurrent Neural Network (RNN) methodology was used to predict potential dormant customers in the card industry, and in particular, Long-Short Term Memory (LSTM) was used to efficiently learn data for a long time. In addition, to redefine the variables needed to predict dormant customers in the card industry, Unified Theory of Technology (UTAUT), an integrated technology acceptance theory, was applied to redefine and group the variables used in the model. As a result, stable model accuracy and F-1 score were obtained, and Hit-Ratio proved that models using LSTM can produce stable results compared to other algorithms. It was also found that there was no moderating effect of demographic information that could occur in UTAUT, which was pointed out in previous studies. Therefore, among variable selection models using UTAUT, dormant customer prediction models using LSTM are proven to have non-biased stable results. This study revealed that there may be academic contributions to the prediction of dormant customers using LSTM algorithms that can learn well from previously untried time series data. In addition, it is a good example to show that it is possible to respond to customers who are preemptively dormant in terms of customer management because it is predicted at a time difference with the actual dormant capture, and it is expected to contribute greatly to the industry.