• Title/Summary/Keyword: 임베딩 모델

Search Result 253, Processing Time 0.02 seconds

BERT-based Classification Model for Korean Documents (한국어 기술문서 분석을 위한 BERT 기반의 분류모델)

  • Hwang, Sangheum;Kim, Dohyun
    • The Journal of Society for e-Business Studies
    • /
    • v.25 no.1
    • /
    • pp.203-214
    • /
    • 2020
  • It is necessary to classify technical documents such as patents, R&D project reports in order to understand the trends of technology convergence and interdisciplinary joint research, technology development and so on. Text mining techniques have been mainly used to classify these technical documents. However, in the case of classifying technical documents by text mining algorithms, there is a disadvantage that the features representing technical documents must be directly extracted. In this study, we propose a BERT-based document classification model to automatically extract document features from text information of national R&D projects and to classify them. Then, we verify the applicability and performance of the proposed model for classifying documents.

TeGCN:Transformer-embedded Graph Neural Network for Thin-filer default prediction (TeGCN:씬파일러 신용평가를 위한 트랜스포머 임베딩 기반 그래프 신경망 구조 개발)

  • Seongsu Kim;Junho Bae;Juhyeon Lee;Heejoo Jung;Hee-Woong Kim
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.3
    • /
    • pp.419-437
    • /
    • 2023
  • As the number of thin filers in Korea surpasses 12 million, there is a growing interest in enhancing the accuracy of assessing their credit default risk to generate additional revenue. Specifically, researchers are actively pursuing the development of default prediction models using machine learning and deep learning algorithms, in contrast to traditional statistical default prediction methods, which struggle to capture nonlinearity. Among these efforts, Graph Neural Network (GNN) architecture is noteworthy for predicting default in situations with limited data on thin filers. This is due to their ability to incorporate network information between borrowers alongside conventional credit-related data. However, prior research employing graph neural networks has faced limitations in effectively handling diverse categorical variables present in credit information. In this study, we introduce the Transformer embedded Graph Convolutional Network (TeGCN), which aims to address these limitations and enable effective default prediction for thin filers. TeGCN combines the TabTransformer, capable of extracting contextual information from categorical variables, with the Graph Convolutional Network, which captures network information between borrowers. Our TeGCN model surpasses the baseline model's performance across both the general borrower dataset and the thin filer dataset. Specially, our model performs outstanding results in thin filer default prediction. This study achieves high default prediction accuracy by a model structure tailored to characteristics of credit information containing numerous categorical variables, especially in the context of thin filers with limited data. Our study can contribute to resolving the financial exclusion issues faced by thin filers and facilitate additional revenue within the financial industry.

Developing a New Algorithm for Conversational Agent to Detect Recognition Error and Neologism Meaning: Utilizing Korean Syllable-based Word Similarity (대화형 에이전트 인식오류 및 신조어 탐지를 위한 알고리즘 개발: 한글 음절 분리 기반의 단어 유사도 활용)

  • Jung-Won Lee;Il Im
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.3
    • /
    • pp.267-286
    • /
    • 2023
  • The conversational agents such as AI speakers utilize voice conversation for human-computer interaction. Voice recognition errors often occur in conversational situations. Recognition errors in user utterance records can be categorized into two types. The first type is misrecognition errors, where the agent fails to recognize the user's speech entirely. The second type is misinterpretation errors, where the user's speech is recognized and services are provided, but the interpretation differs from the user's intention. Among these, misinterpretation errors require separate error detection as they are recorded as successful service interactions. In this study, various text separation methods were applied to detect misinterpretation. For each of these text separation methods, the similarity of consecutive speech pairs using word embedding and document embedding techniques, which convert words and documents into vectors. This approach goes beyond simple word-based similarity calculation to explore a new method for detecting misinterpretation errors. The research method involved utilizing real user utterance records to train and develop a detection model by applying patterns of misinterpretation error causes. The results revealed that the most significant analysis result was obtained through initial consonant extraction for detecting misinterpretation errors caused by the use of unregistered neologisms. Through comparison with other separation methods, different error types could be observed. This study has two main implications. First, for misinterpretation errors that are difficult to detect due to lack of recognition, the study proposed diverse text separation methods and found a novel method that improved performance remarkably. Second, if this is applied to conversational agents or voice recognition services requiring neologism detection, patterns of errors occurring from the voice recognition stage can be specified. The study proposed and verified that even if not categorized as errors, services can be provided according to user-desired results.

A Study on Searching for Export Candidate Countries of the Korean Food and Beverage Industry Using Node2vec Graph Embedding and Light GBM Link Prediction (Node2vec 그래프 임베딩과 Light GBM 링크 예측을 활용한 식음료 산업의 수출 후보국가 탐색 연구)

  • Lee, Jae-Seong;Jun, Seung-Pyo;Seo, Jinny
    • Journal of Intelligence and Information Systems
    • /
    • v.27 no.4
    • /
    • pp.73-95
    • /
    • 2021
  • This study uses Node2vec graph embedding method and Light GBM link prediction to explore undeveloped export candidate countries in Korea's food and beverage industry. Node2vec is the method that improves the limit of the structural equivalence representation of the network, which is known to be relatively weak compared to the existing link prediction method based on the number of common neighbors of the network. Therefore, the method is known to show excellent performance in both community detection and structural equivalence of the network. The vector value obtained by embedding the network in this way operates under the condition of a constant length from an arbitrarily designated starting point node. Therefore, it has the advantage that it is easy to apply the sequence of nodes as an input value to the model for downstream tasks such as Logistic Regression, Support Vector Machine, and Random Forest. Based on these features of the Node2vec graph embedding method, this study applied the above method to the international trade information of the Korean food and beverage industry. Through this, we intend to contribute to creating the effect of extensive margin diversification in Korea in the global value chain relationship of the industry. The optimal predictive model derived from the results of this study recorded a precision of 0.95 and a recall of 0.79, and an F1 score of 0.86, showing excellent performance. This performance was shown to be superior to that of the binary classifier based on Logistic Regression set as the baseline model. In the baseline model, a precision of 0.95 and a recall of 0.73 were recorded, and an F1 score of 0.83 was recorded. In addition, the light GBM-based optimal prediction model derived from this study showed superior performance than the link prediction model of previous studies, which is set as a benchmarking model in this study. The predictive model of the previous study recorded only a recall rate of 0.75, but the proposed model of this study showed better performance which recall rate is 0.79. The difference in the performance of the prediction results between benchmarking model and this study model is due to the model learning strategy. In this study, groups were classified by the trade value scale, and prediction models were trained differently for these groups. Specific methods are (1) a method of randomly masking and learning a model for all trades without setting specific conditions for trade value, (2) arbitrarily masking a part of the trades with an average trade value or higher and using the model method, and (3) a method of arbitrarily masking some of the trades with the top 25% or higher trade value and learning the model. As a result of the experiment, it was confirmed that the performance of the model trained by randomly masking some of the trades with the above-average trade value in this method was the best and appeared stably. It was found that most of the results of potential export candidates for Korea derived through the above model appeared appropriate through additional investigation. Combining the above, this study could suggest the practical utility of the link prediction method applying Node2vec and Light GBM. In addition, useful implications could be derived for weight update strategies that can perform better link prediction while training the model. On the other hand, this study also has policy utility because it is applied to trade transactions that have not been performed much in the research related to link prediction based on graph embedding. The results of this study support a rapid response to changes in the global value chain such as the recent US-China trade conflict or Japan's export regulations, and I think that it has sufficient usefulness as a tool for policy decision-making.

Design of a Deep Neural Network Model for Image Caption Generation (이미지 캡션 생성을 위한 심층 신경망 모델의 설계)

  • Kim, Dongha;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.6 no.4
    • /
    • pp.203-210
    • /
    • 2017
  • In this paper, we propose an effective neural network model for image caption generation and model transfer. This model is a kind of multi-modal recurrent neural network models. It consists of five distinct layers: a convolution neural network layer for extracting visual information from images, an embedding layer for converting each word into a low dimensional feature, a recurrent neural network layer for learning caption sentence structure, and a multi-modal layer for combining visual and language information. In this model, the recurrent neural network layer is constructed by LSTM units, which are well known to be effective for learning and transferring sequence patterns. Moreover, this model has a unique structure in which the output of the convolution neural network layer is linked not only to the input of the initial state of the recurrent neural network layer but also to the input of the multimodal layer, in order to make use of visual information extracted from the image at each recurrent step for generating the corresponding textual caption. Through various comparative experiments using open data sets such as Flickr8k, Flickr30k, and MSCOCO, we demonstrated the proposed multimodal recurrent neural network model has high performance in terms of caption accuracy and model transfer effect.

Similar Contents Recommendation Model Based On Contents Meta Data Using Language Model (언어모델을 활용한 콘텐츠 메타 데이터 기반 유사 콘텐츠 추천 모델)

  • Donghwan Kim
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.1
    • /
    • pp.27-40
    • /
    • 2023
  • With the increase in the spread of smart devices and the impact of COVID-19, the consumption of media contents through smart devices has significantly increased. Along with this trend, the amount of media contents viewed through OTT platforms is increasing, that makes contents recommendations on these platforms more important. Previous contents-based recommendation researches have mostly utilized metadata that describes the characteristics of the contents, with a shortage of researches that utilize the contents' own descriptive metadata. In this paper, various text data including titles and synopses that describe the contents were used to recommend similar contents. KLUE-RoBERTa-large, a Korean language model with excellent performance, was used to train the model on the text data. A dataset of over 20,000 contents metadata including titles, synopses, composite genres, directors, actors, and hash tags information was used as training data. To enter the various text features into the language model, the features were concatenated using special tokens that indicate each feature. The test set was designed to promote the relative and objective nature of the model's similarity classification ability by using the three contents comparison method and applying multiple inspections to label the test set. Genres classification and hash tag classification prediction tasks were used to fine-tune the embeddings for the contents meta text data. As a result, the hash tag classification model showed an accuracy of over 90% based on the similarity test set, which was more than 9% better than the baseline language model. Through hash tag classification training, it was found that the language model's ability to classify similar contents was improved, which demonstrated the value of using a language model for the contents-based filtering.

Multi-site based earthquake event classification using graph convolution networks (그래프 합성곱 신경망을 이용한 다중 관측소 기반 지진 이벤트 분류)

  • Kim, Gwantae;Ku, Bonhwa;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea
    • /
    • v.39 no.6
    • /
    • pp.615-621
    • /
    • 2020
  • In this paper, we propose a multi-site based earthquake event classification method using graph convolution networks. In the traditional earthquake event classification methods using deep learning, they used single-site observation to estimate seismic event class. However, to achieve robust and accurate earthquake event classification on the seismic observation network, the method using the information from the multi-site observations is needed, instead of using only single-site data. Firstly, our proposed model employs convolution neural networks to extract informative embedding features from the single-site observation. Secondly, graph convolution networks are used to integrate the features from several stations. To evaluate our model, we explore the model structure and the number of stations for ablation study. Finally, our multi-site based model outperforms up to 10 % accuracy and event recall rate compared to single-site based model.

Development of a Fake News Detection Model Using Text Mining and Deep Learning Algorithms (텍스트 마이닝과 딥러닝 알고리즘을 이용한 가짜 뉴스 탐지 모델 개발)

  • Dong-Hoon Lim;Gunwoo Kim;Keunho Choi
    • Information Systems Review
    • /
    • v.23 no.4
    • /
    • pp.127-146
    • /
    • 2021
  • Fake news isexpanded and reproduced rapidly regardless of their authenticity by the characteristics of modern society, called the information age. Assuming that 1% of all news are fake news, the amount of economic costs is reported to about 30 trillion Korean won. This shows that the fake news isvery important social and economic issue. Therefore, this study aims to develop an automated detection model to quickly and accurately verify the authenticity of the news. To this end, this study crawled the news data whose authenticity is verified, and developed fake news prediction models using word embedding (Word2Vec, Fasttext) and deep learning algorithms (LSTM, BiLSTM). Experimental results show that the prediction model using BiLSTM with Word2Vec achieved the best accuracy of 84%.

Research on the Utilization of Recurrent Neural Networks for Automatic Generation of Korean Definitional Sentences of Technical Terms (기술 용어에 대한 한국어 정의 문장 자동 생성을 위한 순환 신경망 모델 활용 연구)

  • Choi, Garam;Kim, Han-Gook;Kim, Kwang-Hoon;Kim, You-eil;Choi, Sung-Pil
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.51 no.4
    • /
    • pp.99-120
    • /
    • 2017
  • In order to develop a semiautomatic support system that allows researchers concerned to efficiently analyze the technical trends for the ever-growing industry and market. This paper introduces a couple of Korean sentence generation models that can automatically generate definitional statements as well as descriptions of technical terms and concepts. The proposed models are based on a deep learning model called LSTM (Long Sort-Term Memory) capable of effectively labeling textual sequences by taking into account the contextual relations of each item in the sequences. Our models take technical terms as inputs and can generate a broad range of heterogeneous textual descriptions that explain the concept of the terms. In the experiments using large-scale training collections, we confirmed that more accurate and reasonable sentences can be generated by CHAR-CNN-LSTM model that is a word-based LSTM exploiting character embeddings based on convolutional neural networks (CNN). The results of this study can be a force for developing an extension model that can generate a set of sentences covering the same subjects, and furthermore, we can implement an artificial intelligence model that automatically creates technical literature.

Generative Multi-Turn Chatbot Using Generative Adversarial Network (생성적 적대적 신경망을 이용한 생성기반 멀티턴 챗봇)

  • Kim, Jintae;Kim, Harksoo;Kwon, Oh-Woog;Kim, Young-Gil
    • Annual Conference on Human and Language Technology
    • /
    • 2018.10a
    • /
    • pp.25-30
    • /
    • 2018
  • 기존의 검색 기반 챗봇 시스템과 다르게 생성 기반 챗봇 시스템은 사전에 정의된 응답에 의존하지 않고 채팅 말뭉치를 학습한 신경망 모델을 사용하여 응답을 생성한다. 생성 기반 챗봇 시스템이 사람과 같이 자연스러운 응답을 생성하려면 이전 문맥을 반영해야 할 필요가 있다. 기존 연구에서는 문맥을 반영하기 위해 이전 문맥과 입력 발화를 통합하여 하나의 벡터로 표현했다. 이러한 경우 이전 문맥과 입력 발화가 분리되어 있지 않아 이전 문맥이 필요하지 않는 경우 잡음으로 작용할 수 있다. 본 논문은 이러한 문제를 해결하기 위해 입력 발화와 이전 문맥을 각각의 벡터로 표현하는 방법을 제안한다. 또한 생성적 적대적 신경망을 통해 챗봇 시스템을 보강하는 방법을 제안한다. 채팅 말뭉치(55,000 개의 학습 데이터, 5,000개의 검증 데이터, 5,260 개의 평가 데이터)를 사용한 실험에서 제안한 문맥 반영 방법과 생성적 적대적 신경망을 통한 챗봇 시스템 보강 방법은 BLEU와 임베딩 기반 평가의 성능 향상에 도움을 주었다.

  • PDF