• Title/Summary/Keyword: Conversational Artificial Intelligence (대화형 인공지능)


Building Open Domain Chatbot based Language Model (언어모델 기반 오픈 도메인 챗봇 구현)

  • Kim, Seung-Tae;Koo, Jahwan;Kim, Ung-Mo
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2020.11a
    • /
    • pp.931-933
    • /
    • 2020
  • Natural language processing is one of the core technologies of artificial intelligence. Among NLP tasks, building an open-domain chatbot is considered especially difficult: unlike task-oriented chatbots with clear goals and FAQs, open-domain chatbots must sustain continuous dialogue and draw on a vast amount of common-sense knowledge. In this work, we take a model trained on data consisting of short question-answer pairs and further train it on conversational data, aiming to build a more natural chatbot.

Identifying Social Relationships using Text Analysis for Social Chatbots (소셜챗봇 구축에 필요한 관계성 추론을 위한 텍스트마이닝 방법)

  • Kim, Jeonghun;Kwon, Ohbyung
    • Journal of Intelligence and Information Systems
    • /
    • v.24 no.4
    • /
    • pp.85-110
    • /
    • 2018
  • A chatbot is an interactive assistant that supports many communication modes: voice, images, video, or text. It is an artificial intelligence-based application that responds to users' needs or solves problems through user-friendly conversation. However, current chatbots focus on understanding and performing tasks requested by the user; their ability to generate personalized conversation suitable for relationship-building is limited. Recognizing the need to build a relationship and producing suitable conversation is more important for social chatbots, which require social skills, than for problem-solving chatbots such as intelligent personal assistants. The purpose of this study is to propose a text analysis method that evaluates the relationship between chatbot and user from the content the user inputs, adapted to the communication situation, so that the chatbot can conduct suitable conversations. To evaluate the method's performance, we trained it on and verified the results against actual SNS conversation records. The results will aid implementation of social chatbots, as the method yields excellent results even when the user's private profile information is excluded for privacy reasons.

Developing a New Algorithm for Conversational Agent to Detect Recognition Error and Neologism Meaning: Utilizing Korean Syllable-based Word Similarity (대화형 에이전트 인식오류 및 신조어 탐지를 위한 알고리즘 개발: 한글 음절 분리 기반의 단어 유사도 활용)

  • Jung-Won Lee;Il Im
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.3
    • /
    • pp.267-286
    • /
    • 2023
  • Conversational agents such as AI speakers rely on voice for human-computer interaction, and voice recognition errors often occur in conversational situations. Recognition errors in user utterance records fall into two types. The first is misrecognition, where the agent fails to recognize the user's speech at all. The second is misinterpretation, where the speech is recognized and a service is provided, but the interpretation differs from the user's intention. Misinterpretation errors require separate detection because they are recorded as successful service interactions. In this study, various text separation methods were applied to detect misinterpretation. For each method, we measured the similarity of consecutive utterance pairs using word embedding and document embedding techniques, which convert words and documents into vectors. This approach goes beyond simple word-based similarity calculation to explore a new way of detecting misinterpretation errors. We used real user utterance records to train and develop a detection model based on patterns of misinterpretation causes. The most significant result was obtained through initial consonant extraction for detecting misinterpretation errors caused by unregistered neologisms; comparison with other separation methods revealed different error types. This study has two main implications. First, for misinterpretation errors that are hard to detect because they are not flagged as recognition failures, we proposed diverse text separation methods and found a novel one that improved performance remarkably. Second, if applied to conversational agents or voice recognition services that need neologism detection, the patterns of errors arising from the voice recognition stage can be specified. The study also proposed and verified that, even for interactions not categorized as errors, services can be provided according to the user's desired results.
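The initial consonant extraction that performed best in this paper exploits the fact that precomposed Hangul syllables (U+AC00 to U+D7A3) are laid out algorithmically in Unicode, so the initial consonant (choseong) of any syllable can be recovered arithmetically. Below is a minimal sketch of that decomposition step; the `difflib`-based similarity function is an illustrative assumption for the comparison step, not the authors' implementation:

```python
from difflib import SequenceMatcher

# The 19 initial consonants (choseong) in Unicode jamo order.
CHOSEONG = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")

def extract_choseong(text: str) -> str:
    """Map each precomposed Hangul syllable to its initial consonant.

    A syllable's code point is 0xAC00 + (initial*21 + medial)*28 + final,
    so integer division by 588 (= 21*28) recovers the initial index.
    """
    out = []
    for ch in text:
        code = ord(ch)
        if 0xAC00 <= code <= 0xD7A3:
            out.append(CHOSEONG[(code - 0xAC00) // 588])
        else:
            out.append(ch)  # leave non-Hangul characters untouched
    return "".join(out)

def choseong_similarity(a: str, b: str) -> float:
    """Similarity of two strings based only on their initial-consonant sequences."""
    return SequenceMatcher(None, extract_choseong(a), extract_choseong(b)).ratio()
```

Because the comparison drops vowels and final consonants, a neologism that was transcribed with the wrong vowel (e.g. 한국 vs. 한극, both "ㅎㄱ") still scores as a full match, which is the property the paper relies on.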

A Korean menu-ordering sentence text-to-speech system using conformer-based FastSpeech2 (콘포머 기반 FastSpeech2를 이용한 한국어 음식 주문 문장 음성합성기)

  • Choi, Yerin;Jang, JaeHoo;Koo, Myoung-Wan
    • The Journal of the Acoustical Society of Korea
    • /
    • v.41 no.3
    • /
    • pp.359-366
    • /
    • 2022
  • In this paper, we present a Korean menu-ordering sentence Text-to-Speech (TTS) system using Conformer-based FastSpeech2. The Conformer is the convolution-augmented Transformer, originally proposed for speech recognition. By combining the two structures, the Conformer extracts better local and global features. It comprises two half-step feed-forward modules, at the front and at the end, sandwiching a multi-head self-attention module and a convolution module. We introduce the Conformer into Korean TTS because it is known to work well in Korean speech recognition. To compare a Transformer-based TTS model with a Conformer-based one, we train both FastSpeech2 and Conformer-based FastSpeech2. We collected a phoneme-balanced data set and used it to train our models. The corpus comprises not only general conversation but also menu-ordering conversation consisting mainly of loanwords, addressing current Korean TTS models' degradation on loanwords. Generating synthesized speech with Parallel WaveGAN, the Conformer-based FastSpeech2 achieved a superior MOS of 4.04. We confirm that performance improved when the same structure was changed from Transformer to Conformer in Korean TTS.
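The "sandwich" structure described above (two half-step feed-forward modules around self-attention and convolution) can be sketched schematically. The toy code below illustrates only the macaron-style residual wiring, with the sub-modules passed in as callables; it is not the authors' model, and real implementations operate on tensors with learned weights:

```python
def half_step_residual(x, ffn):
    """Macaron-style half step: residual connection with a 0.5 weight on the FFN output."""
    return [xi + 0.5 * fi for xi, fi in zip(x, ffn(x))]

def conformer_block(x, ffn1, mhsa, conv, ffn2, layernorm):
    """FFN/2 -> multi-head self-attention -> convolution -> FFN/2 -> LayerNorm,
    each sub-module wrapped in a residual connection."""
    x = half_step_residual(x, ffn1)
    x = [xi + mi for xi, mi in zip(x, mhsa(x))]   # self-attention residual
    x = [xi + ci for xi, ci in zip(x, conv(x))]   # convolution residual
    x = half_step_residual(x, ffn2)
    return layernorm(x)

# Demonstration: with zero-output sub-modules and an identity LayerNorm,
# the block reduces to the residual path alone, leaving the input unchanged.
zero = lambda x: [0.0] * len(x)
identity = lambda x: x
out = conformer_block([1.0, 2.0, 3.0], zero, zero, zero, zero, identity)
```

The residual paths are what let the block degrade gracefully to an identity mapping, which is why stacking many such blocks remains trainable.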

Analysis and Design of Social-Robot System based on IoT (사물인터넷 기반 소셜로봇 시스템의 분석 및 설계)

  • Cho, Byung-Ho
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.19 no.1
    • /
    • pp.179-185
    • /
    • 2019
  • The core technologies of a social robot are voice recognition and a dialogue engine, but their development is expensive, and implementing a robot's conversation function has been difficult because of insufficient performance. Dialogue between human and robot has become feasible thanks to advances in cloud AI technology and the open APIs supplied by several companies. In this paper, current intelligent social robot technology trends are surveyed and an effective social robot system architecture is designed. An effective analysis and design method for a social robot system is also presented, with a user requirement analysis using an object-oriented method, a flowchart, and a screen design.

Artificial Intelligence for Assistance of Facial Expression Practice Using Emotion Classification (감정 분류를 이용한 표정 연습 보조 인공지능)

  • Dong-Kyu, Kim;So Hwa, Lee;Jae Hwan, Bong
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.17 no.6
    • /
    • pp.1137-1144
    • /
    • 2022
  • In this study, an artificial intelligence (AI) system was developed to help users practice the facial expressions that convey emotions. The developed AI takes multimodal inputs consisting of sentences and facial images and feeds them to deep neural networks (DNNs). The DNNs calculate the similarity between the emotion predicted from the sentence and the emotion predicted from the facial image. The user practices a facial expression for the situation given by a sentence, and the AI provides numerical feedback based on that similarity. A ResNet34 model was trained on the public FER2013 data to predict emotions from facial images. To predict emotions from sentences, a KoBERT model was fine-tuned via transfer learning on the conversational speech dataset for emotion classification released publicly by AIHub. The DNN that predicts emotions from facial images reached 65% accuracy, comparable to human emotion classification ability; the DNN that predicts emotions from sentences achieved 90% accuracy. The performance of the developed AI was evaluated through experiments with changing facial expressions in which an ordinary person participated.
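The abstract does not state which similarity measure the feedback uses, but a common choice for comparing the two models' emotion probability distributions is cosine similarity. A minimal sketch, assuming FER2013's seven emotion classes and a hypothetical 0-100 feedback scale:

```python
import math

# FER2013's seven emotion classes (order is illustrative).
EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise", "neutral"]

def cosine_similarity(p, q):
    """Cosine similarity between two emotion probability vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm

def feedback_score(face_probs, text_probs):
    """Numerical feedback: how closely the practiced facial expression matches
    the emotion implied by the sentence, scaled to 0-100."""
    return round(100 * cosine_similarity(face_probs, text_probs), 1)

# A perfect match between the face model's and text model's predictions scores 100.
happy = [0, 0, 0, 1, 0, 0, 0]
sad = [0, 0, 0, 0, 1, 0, 0]
```

With softmax outputs the vectors are non-negative, so the score stays in [0, 100]; a completely mismatched expression (e.g. a happy face for a sad sentence) scores 0.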

Research on art contents based on 4th industrial technology -Focusing on artificial intelligence painting and NFT art- (4차 산업 기술 기반의 예술 콘텐츠 연구 -인공지능 회화와 NFT 미술을 중심으로-)

  • Bang Jinwon
    • The Journal of the Convergence on Culture Technology
    • /
    • v.10 no.4
    • /
    • pp.613-625
    • /
    • 2024
  • This study analyzed cases of convergence between AI painting and NFT art, art content created with digital technology, one of the innovative technologies of the 4th industrial revolution, and explored their characteristics. Digital technology, which is transforming the paradigm of 21st-century life, is being used in creative art, and AI painting and NFT art, which use it as an expressive tool, are changing how art is perceived and accepted. AI painting based on big data and artificial intelligence is evolving into interactive everyday art, and NFT art based on blockchain and NFT technology is becoming the art of the metaverse, with both economic and cultural value. This study therefore explored the various aspects and values of these digital convergence arts. Representative examples of AI painting and NFT art were classified into cognitive creative AI painting, language-generative AI, art-economic NFTs, and art-cultural NFTs, and their characteristics, contents, and meanings were analyzed. It is hoped that the results of this study will contribute to the development of AI painting and NFT art as digital convergence arts.

A Study on Immersive Content Production and Storytelling Methods using Photogrammetry and Artificial Intelligence Technology (포토그래메트리 및 인공지능 기술을 활용한 실감 콘텐츠 제작과 스토리텔링 방법 연구)

  • Kim, Jungho;Park, JinWan;Yoo, Taekyung
    • Journal of Broadcast Engineering
    • /
    • v.27 no.5
    • /
    • pp.654-664
    • /
    • 2022
  • Spurred by interest during the COVID-19 pandemic, immersive content overcomes spatial limitations through convergence with extended reality, artificial intelligence, and photogrammetry, presenting a new paradigm in content markets such as entertainment, media, performances, and exhibitions. However, for immersive content to sustain public interest, storytelling methods that deepen immersion matter more than technological novelty. In this study, we therefore propose an immersive content storytelling method using artificial intelligence and photogrammetry. The proposed method builds the content's story through interaction between interactive virtual beings and participants; this participation increases immersion in the content. We expect this study to help creators in the accelerating immersive content market produce content efficiently through the proposed storytelling methodology based on AI-driven virtual beings, and to contribute to establishing an immersive content production pipeline that uses artificial intelligence and photogrammetry.

The Effect of Interjection in Conversational Interaction with the AI Agent: In the Context of Self-Driving Car (인공지능 에이전트 대화형 인터랙션에서의 감탄사 효과: 자율주행 맥락에서)

  • Lee, Sooji;Seo, Jeeyoon;Choi, Junho
    • The Journal of the Convergence on Culture Technology
    • /
    • v.8 no.1
    • /
    • pp.551-563
    • /
    • 2022
  • This study examines the effect on user experience when an embodied agent in a self-driving car adds emotional expression to its conversation by using interjections. An experiment was designed with two factors: whether the agent's conversational feedback included interjections (with vs. without) and the type of conversation (task-oriented vs. social-oriented). The online experiment presented four video clips of conversation scenarios and measured intimacy, likability, trust, social presence, perceived anthropomorphism, and future intention to use. When the agent used interjections, a main effect on social presence was found in both conversation types. In the task-oriented conversation, trust and future intention to use were higher when the agent did not use interjections than when it added emotional expression. In the context of conversation with an AI agent in a self-driving car, adding emotional expression through interjections enhanced only social presence, with no effect on the other user experience factors.

Development of an interactive smart cooking service system using behavior and voice recognition (행동 및 음성인식 기술을 이용한 대화형 스마트 쿠킹 서비스 시스템 개발)

  • Moon, Yu-Gyeong;Kim, Ga-Yeon;Kim, Yoo-Ha;Park, Min-Ji;Seo, Min-Hyuk;Nah, Jeong-Eun
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2021.11a
    • /
    • pp.1128-1131
    • /
    • 2021
  • As demand in the home-cooking market has grown due to COVID-19, people need more convenient cooking-assistance systems. Existing cooking systems provide recipes one-way through mobile phones or books, so users must interrupt the cooking process to consult them repeatedly. The "interactive smart cooking service" system is an AI system that interacts with the user throughout the cooking process, recognizing what is needed and announcing it at the appropriate time. It uses Google's MediaPipe to recognize the user's joints and a trained model to recognize the user's cooking actions, and a chatbot built with Dialogflow presents required ingredients, next steps, and other information in real time. Real-time action recognition also detects hazardous situations during cooking, such as fire or cuts, and alerts the user to prevent accidents. Voice recognition enables two-way communication between the user and the system, and voice control of the screen avoids unnecessary display touches while cooking, providing a hygienic cooking environment.
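The Dialogflow agent described above maps user utterances to intents such as "next step" or "ingredients" and answers from the current recipe state. A minimal keyword-based sketch of that routing idea in plain Python; the recipe, keywords, and intent names are hypothetical stand-ins for the actual Dialogflow configuration:

```python
# Hypothetical recipe and intent vocabulary standing in for the Dialogflow agent.
RECIPE = ["Chop the onion", "Heat the pan", "Add the onion and stir"]

INTENT_KEYWORDS = {
    "next_step": ["next", "then"],
    "ingredients": ["ingredient", "need"],
}

def route(utterance: str, step: int):
    """Match the utterance to an intent and return (reply, new_step)."""
    text = utterance.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(k in text for k in keywords):
            if intent == "next_step":
                if step + 1 < len(RECIPE):
                    return RECIPE[step + 1], step + 1
                return "You're done!", step       # no steps remain
            if intent == "ingredients":
                return "You need: onion.", step   # answer without advancing
    return "Sorry, I didn't catch that.", step    # fallback intent
```

A production agent would replace the keyword lists with Dialogflow's trained intent classifier, but the conversational loop (classify, answer, update state) is the same.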