• Title/Summary/Keyword: 자연어 이해

Search Result 177, Processing Time 0.026 seconds

Survey on Out-Of-Domain Detection for Dialog Systems (대화시스템 미지원 도메인 검출에 관한 조사)

  • Jeong, Young-Seob;Kim, Young-Min
    • Journal of Convergence for Information Technology
    • /
    • v.9 no.9
    • /
    • pp.1-12
    • /
    • 2019
  • A dialog system becomes a new way of communication between human and computer. The dialog system takes human voice as an input, and gives a proper response in voice or perform an action. Although there are several well-known products of dialog system (e.g., Amazon Echo, Naver Wave), they commonly suffer from a problem of out-of-domain utterances. If it poorly detects out-of-domain utterances, then it will significantly harm the user satisfactory. There have been some studies aimed at solving this problem, but it is still necessary to study about this intensively. In this paper, we give an overview of the previous studies of out-of-domain detection in terms of three point of view: dataset, feature, and method. As there were relatively smaller studies of this topic due to the lack of datasets, we believe that the most important next research step is to construct and share a large dataset for dialog system, and thereafter try state-of-the-art techniques upon the dataset.

Understanding the Categories and Characteristics of Depressive Moods in Chatbot Data (챗봇 데이터에 나타난 우울 담론의 범주와 특성의 이해)

  • Chin, HyoJin;Jung, Chani;Baek, Gumhee;Cha, Chiyoung;Choi, Jeonghoi;Cha, Meeyoung
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.9
    • /
    • pp.381-390
    • /
    • 2022
  • Influenced by a culture that prefers non-face-to-face activity during the COVID-19 pandemic, chatbot usage is accelerating. Chatbots have been used for various purposes, not only for customer service in businesses and social conversations for fun but also for mental health. Chatbots are a platform where users can easily talk about their depressed moods because anonymity is guaranteed. However, most relevant research has been on social media data, especially Twitter data, and few studies have analyzed the commercially used chatbots data. In this study, we identified the characteristics of depressive discourse in user-chatbot interaction data by analyzing the chats, including the word 'depress,' using the topic modeling algorithm and the text-mining technique. Moreover, we compared its characteristics with those of the depressive moods in the Twitter data. Finally, we draw several design guidelines and suggest avenues for future research based on the study findings.

Text summarization of dialogue based on BERT

  • Nam, Wongyung;Lee, Jisoo;Jang, Beakcheol
    • Journal of the Korea Society of Computer and Information
    • /
    • v.27 no.8
    • /
    • pp.41-47
    • /
    • 2022
  • In this paper, we propose how to implement text summaries for colloquial data that are not clearly organized. For this study, SAMSum data, which is colloquial data, was used, and the BERTSumExtAbs model proposed in the previous study of the automatic summary model was applied. More than 70% of the SAMSum dataset consists of conversations between two people, and the remaining 30% consists of conversations between three or more people. As a result, by applying the automatic text summarization model to colloquial data, a result of 42.43 or higher was derived in the ROUGE Score R-1. In addition, a high score of 45.81 was derived by fine-tuning the BERTSum model, which was previously proposed as a text summarization model. Through this study, the performance of colloquial generation summary has been proven, and it is hoped that the computer will understand human natural language as it is and be used as basic data to solve various tasks.

Document Classification Methodology Using Autoencoder-based Keywords Embedding

  • Seobin Yoon;Namgyu Kim
    • Journal of the Korea Society of Computer and Information
    • /
    • v.28 no.9
    • /
    • pp.35-46
    • /
    • 2023
  • In this study, we propose a Dual Approach methodology to enhance the accuracy of document classifiers by utilizing both contextual and keyword information. Firstly, contextual information is extracted using Google's BERT, a pre-trained language model known for its outstanding performance in various natural language understanding tasks. Specifically, we employ KoBERT, a pre-trained model on the Korean corpus, to extract contextual information in the form of the CLS token. Secondly, keyword information is generated for each document by encoding the set of keywords into a single vector using an Autoencoder. We applied the proposed approach to 40,130 documents related to healthcare and medicine from the National R&D Projects database of the National Science and Technology Information Service (NTIS). The experimental results demonstrate that the proposed methodology outperforms existing methods that rely solely on document or word information in terms of accuracy for document classification.

Verification of educational goal of reading area in Korean SAT through natural language processing techniques (대학수학능력시험 독서 영역의 교육 목표를 위한 자연어처리 기법을 통한 검증)

  • Lee, Soomin;Kim, Gyeongmin;Lim, Heuiseok
    • Journal of the Korea Convergence Society
    • /
    • v.13 no.1
    • /
    • pp.81-88
    • /
    • 2022
  • The major educational goal of reading part, which occupies important portion in Korean language in Korean SAT, is to evaluated whether a given text can be fully understood. Therefore given questions in the exam must be able to solely solvable by given text. In this paper we developed a datatset based on Korean SAT's reading part in order to evaluate whether a deep learning language model can classify if the given question is true or false, which is a binary classification task in NLP. In result, by applying language model solely according to the passages in the dataset, we were able to acquire better performance than 59.2% in F1 score for human performance in most of language models, that KoELECTRA scored 62.49% in our experiment. Also we proved that structural limit of language models can be eased by adjusting data preprocess.

A Study on the Definition of Data Literacy for Elementary and Secondary Artificial Intelligence Education (초·중등 인공지능 교육을 위한 데이터 리터러시 정의 연구)

  • Kim, SeulKi;Kim, Taeyoung
    • 한국정보교육학회:학술대회논문집
    • /
    • 2021.08a
    • /
    • pp.59-67
    • /
    • 2021
  • The development of AI technology has brought about a big change in our lives. As AI's influence grows from life to society to the economy, the importance of education on AI and data is also growing. In particular, the OECD Education Research Report and various domestic information and curriculum studies address data literacy and present it as an essential competency. Looking at domestic and international studies, one can see that the definition of data literacy differs in its specific content and scope from researchers to researchers. Thus, the definition of major research related to data literacy was analyzed from various angles and derived from various angles. In key studies, Word2vec natural language processing methods, along with word frequency analysis used to define data literacy, are used to analyze semantic similarities and nominate them based on content elements of curriculum research to derive the definition of 'understanding and using data to process information'. Based on the definition of data literacy derived from this study, we hope that the contents will be revised and supplemented, and more research will be conducted to provide a good foundation for educational research that develops students' future capabilities.

  • PDF

A Knowledge Graph-based Chatbot to Prevent the Leakage of LLM User's Sensitive Information (LLM 사용자의 민감정보 유출 방지를 위한 지식그래프 기반 챗봇)

  • Keedong Yoo
    • Knowledge Management Research
    • /
    • v.25 no.2
    • /
    • pp.1-18
    • /
    • 2024
  • With the increasing demand for and utilization of large language models (LLMs), the risk of user sensitive information being inputted and leaked during the use of LLMs also escalates. Typically recognized as a tool for mitigating the hallucination issues of LLMs, knowledge graphs, constructed independently from LLMs, can store and manage sensitive user information separately, thereby minimizing the potential for data breaches. This study, therefore, presents a knowledge graph-based chatbot that transforms user-inputted natural language questions into queries appropriate for the knowledge graph using LLMs, subsequently executing these queries and extracting the results. Furthermore, to evaluate the functional validity of the developed knowledge graph-based chatbot, performance tests are conducted to assess the comprehension and adaptability to existing knowledge graphs, the capability to create new entity classes, and the accessibility of LLMs to the knowledge graph content.

Exploring Narrative Intelligence in AI: Implications for the Evolution of Homo narrans (인공지능의 서사 지능 탐구 : 새로운 서사 생태계와 호모 나랜스의 진화)

  • Hochang Kwon
    • Trans-
    • /
    • v.16
    • /
    • pp.107-133
    • /
    • 2024
  • Narratives are fundamental to human cognition and social culture, serving as the primary means by which individuals and societies construct meaning, share experiences, and convey cultural and moral values. The field of artificial intelligence, which seeks to mimic human thought and behavior, has long studied story generation and story understanding, and today's Large Language Models are demonstrating remarkable narrative capabilities based on advances in natural language processing. This situation raises a variety of changes and new issues, but a comprehensive discussion of them is hard to find. This paper aims to provide a holistic view of the current state and future changes by exploring the intersections and interactions of human and AI narrative intelligence. This paper begins with a review of multidisciplinary research on the intrinsic relationship between humans and narrative, represented by the term Homo narrans, and then provide a historical overview of how narrative has been studied in the field of AI. This paper then explore the possibilities and limitations of narrative intelligence as revealed by today's Large Language Models, and present three philosophical challenges for understanding the implications of AI with narrative intelligence.

Method of ChatBot Implementation Using Bot Framework (봇 프레임워크를 활용한 챗봇 구현 방안)

  • Kim, Ki-Young
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.15 no.1
    • /
    • pp.56-61
    • /
    • 2022
  • In this paper, we classify and present AI algorithms and natural language processing methods used in chatbots. A framework that can be used to implement a chatbot is also described. A chatbot is a system with a structure that interprets the input string by constructing the user interface in a conversational manner and selects an appropriate answer to the input string from the learned data and outputs it. However, training is required to generate an appropriate set of answers to a question and hardware with considerable computational power is required. Therefore, there is a limit to the practice of not only developing companies but also students learning AI development. Currently, chatbots are replacing the existing traditional tasks, and a practice course to understand and implement the system is required. RNN and Char-CNN are used to increase the accuracy of answering questions by learning unstructured data by applying technologies such as deep learning beyond the level of responding only to standardized data. In order to implement a chatbot, it is necessary to understand such a theory. In addition, the students presented examples of implementation of the entire system by utilizing the methods that can be used for coding education and the platform where existing developers and students can implement chatbots.

Can Online Community Managers Enhance User Engagement?: Evidence from Anonymous Social Media Postings (온라인 커뮤니티 이용자 참여 증진을 위한 관리자의 운영 전략: 대학별 대나무숲 분석을 중심으로)

  • Kim, Hyejeong;Hwang, Seungyeup;Kwak, Youshin;Choi, Jeonghye
    • Knowledge Management Research
    • /
    • v.23 no.2
    • /
    • pp.211-228
    • /
    • 2022
  • As social media marketing becomes prevalent, it is necessary to understand the administrative role of managers in promoting user engagement. However, little is known about how community managers enhance user engagement in social media. In this research, we study how managers can boost online user participation, including clicking likes and writing comments. Using the SUR (Seemingly Unrelated Regression) model, we find out that the active participation of managers increases user engagement of both passive (likes) and active (comments) ones. In addition, we find that the number of emotional words included in posts has a positive effect on the passive engagement whereas it negatively affects the active engagement. Lastly, the congruency between posts and comments positively affects users' passive engagement. This study contributes to prior literature related to online community management and text analyses. Furthermore, our findings offer managerial insights for practitioners and social media managers to further facilitate user engagement.