• Title/Summary/Keyword: AI Speaker

The Effect of AI Agent's Multi Modal Interaction on the Driver Experience in the Semi-autonomous Driving Context : With a Focus on the Existence of Visual Character (반자율주행 맥락에서 AI 에이전트의 멀티모달 인터랙션이 운전자 경험에 미치는 효과 : 시각적 캐릭터 유무를 중심으로)

  • Suh, Min-soo;Hong, Seung-Hye;Lee, Jeong-Myeong
    • The Journal of the Korea Contents Association / v.18 no.8 / pp.92-101 / 2018
  • As interactive AI speakers become popular, voice recognition is regarded as an important vehicle-driver interaction method in autonomous driving situations. The purpose of this study is to confirm whether multimodal interaction, in which feedback is delivered through both the auditory mode and the visual mode of an on-screen AI character, is more effective in optimizing user experience than the auditory mode alone. Participants performed music selection and adjustment tasks through the AI speaker while driving, and we measured information and system quality, presence, perceived usefulness and ease of use, and continuance intention. The analysis showed no multimodal effect of the visual character on most user experience factors, nor on continuance intention. Rather, the auditory single mode was more effective than the multimodal condition on the information quality factor. In the semi-autonomous driving stage, which requires the driver's cognitive effort, multimodal interaction is not more effective than single-mode interaction in optimizing user experience.

A Design and Implementation of The Deep Learning-Based Senior Care Service Application Using AI Speaker

  • Mun Seop Yun;Sang Hyuk Yoon;Ki Won Lee;Se Hoon Kim;Min Woo Lee;Ho-Young Kwak;Won Joo Lee
    • Journal of the Korea Society of Computer and Information / v.29 no.4 / pp.23-30 / 2024
  • In this paper, we propose a deep learning-based personalized senior care service application. For user convenience, the proposed application uses speech-to-text technology to convert the user's speech into text, which is then passed as input to AutoGen, Microsoft's multi-agent conversational framework for large language models. AutoGen uses data from previous conversations between the senior and the chatbot to understand the other user's intent and respond accordingly, and then uses a back-end agent to create a wish list, a shared calendar, and a greeting message in the other user's voice through a deep learning model for voice cloning. Additionally, the application can control home IoT services through SKT's AI speaker (NUGU). The proposed application is expected to contribute to future AI-based senior care technology.

Differences in Perceptions of Usage and Intention to Continuous Use of AI Speakers: Focusing on Functions of Music, News, and Search (AI 스피커의 기능별 이용 인식과 지속 이용 의도의 차이: 음악, 뉴스, 검색을 중심으로)

  • Kim, Young Ju;Kim, Sung Tae;Kim, Hyoung-Jee
    • The Journal of the Korea Contents Association / v.20 no.11 / pp.644-655 / 2020
  • This study examined how perceptions of AI speakers and the intention to continue using them differ according to usage function. We divided usage patterns into single- and multi-function orientations based on which audio content functions (music, news, and search) were used, and analyzed the differences in perceptions of AI speaker use and in continuance intention. A total of 335 men and women with experience using AI speakers participated in an online survey. The results are as follows. First, men used AI speakers mainly for acquiring news, and the extent of news acquisition differed between respondents in their 20s and those in their 40s. Second, perceived usefulness and ease of use were higher in the multi-functional group (music-news-search). Last, the intention to continue using AI speakers was highest in the multi-functional group, and users focusing on music listening scored relatively higher than users of other functions. The findings are expected to serve as foundational data for expanding the use of AI speakers and developing service provision strategies for each AI speaker brand.

A study on the usage intention of AI(artificial intelligence) speaker

  • Kwon, Soon-Hong;Lim, Yang-Whan;Kim, Hyun-Jeong
    • Journal of the Korea Society of Computer and Information / v.25 no.1 / pp.199-206 / 2020
  • This study examines the factors affecting consumers' intention to use AI speakers, focusing on the perceived value and the perceived necessity of the product. Factors affecting consumers' perceived value of the product were divided into benefits and costs. Reflecting the characteristics of information technology products, perceptions of product usefulness were also included. Empirical results show that consumers' perceptions of the benefits and usefulness of AI speaker products have a positive effect on perceived value and perceived necessity. Perceived necessity had a significant positive (+) effect on perceived value. Perceived necessity and perceived value each had a positive (+) effect on intention to use. However, the cost perceived by consumers did not have a significant effect on perceived value.

Multimodal depression detection system based on attention mechanism using AI speaker (AI 스피커를 활용한 어텐션 메커니즘 기반 멀티모달 우울증 감지 시스템)

  • Park, Junhee;Moon, Nammee
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2021.06a / pp.28-31 / 2021
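  • Depression is a growing mental health problem worldwide, and research on detecting depression in daily life is under way to address it. In this paper, we therefore propose a multimodal depression detection system based on an attention mechanism that uses an AI speaker, a device closely tied to daily life. The proposed method collects the voice and text data available from an AI speaker and trains on each type of data through a CNN (Convolutional Neural Network) and a BiLSTM (Bidirectional Long Short-Term Memory network). During training, an attention mechanism applies self-attention to assign additional weights to the feature vectors. Finally, the attention-weighted features from the voice and text data are summed, and a depression score is predicted through softmax.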
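
The fusion step described in this abstract (attention-weighted features from two modalities, summed and passed through softmax) can be illustrated with a minimal sketch. This is not the authors' model: the CNN and BiLSTM encoders are omitted, a simple dot-product attention pooling with a hypothetical query vector stands in for self-attention, and all names and weights (`attention_pool`, `fuse_and_classify`, `class_weights`) are assumptions made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(features, query):
    """Pool a sequence of feature vectors into one vector.

    Each vector is scored by its dot product with `query` (a hypothetical
    learned parameter); the scores are turned into weights with softmax and
    used for a weighted sum. Returns (pooled_vector, weights).
    """
    scores = [sum(f * q for f, q in zip(vec, query)) for vec in features]
    weights = softmax(scores)
    dim = len(features[0])
    pooled = [sum(w * vec[d] for w, vec in zip(weights, features)) for d in range(dim)]
    return pooled, weights


def fuse_and_classify(audio_feats, text_feats, audio_query, text_query, class_weights):
    """Sum the attention-pooled audio and text vectors, then apply softmax.

    `class_weights` is a hypothetical linear layer (one row per class).
    """
    audio_vec, _ = attention_pool(audio_feats, audio_query)
    text_vec, _ = attention_pool(text_feats, text_query)
    fused = [a + t for a, t in zip(audio_vec, text_vec)]
    logits = [sum(w * x for w, x in zip(row, fused)) for row in class_weights]
    return softmax(logits)


# Toy inputs standing in for encoder outputs (made-up numbers).
audio = [[0.2, 0.9], [0.7, 0.1], [0.4, 0.4]]
text = [[0.5, 0.3], [0.1, 0.8]]
probs = fuse_and_classify(audio, text, [1.0, 0.0], [0.0, 1.0], [[1.0, -1.0], [-1.0, 1.0]])
print(probs)  # two class probabilities that sum to 1
```

In a trained system the query vectors and the classification layer would be learned jointly with the encoders; here they are fixed only so that the sketch runs on its own.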

A Multi-speaker Speech Synthesis System Using X-vector (x-vector를 이용한 다화자 음성합성 시스템)

  • Jo, Min Su;Kwon, Chul Hong
    • The Journal of the Convergence on Culture Technology / v.7 no.4 / pp.675-681 / 2021
  • With the recent growth of the AI speaker market, demand is increasing for speech synthesis technology that enables natural conversation with users. There is therefore a need for a multi-speaker speech synthesis system that can generate voices of various tones. Synthesizing natural speech requires training on a large-capacity, high-quality speech DB. However, collecting a high-quality, large-capacity speech database uttered by many speakers is very difficult in terms of recording time and cost. It is therefore necessary to train the speech synthesis system on a speech DB covering a very large number of speakers with only a small amount of training data per speaker, and a technique for naturally expressing the tone and prosody of multiple speakers is required. In this paper, we propose a technique for constructing a speaker encoder by applying the deep learning-based x-vector technique used in speaker recognition, and for synthesizing a new speaker's tone from a small amount of data through that speaker encoder. In the multi-speaker speech synthesis system, the module that synthesizes the mel-spectrogram from input text is built on Tacotron2, and the vocoder that generates the synthesized speech is a WaveNet with a mixture of logistic distributions. The x-vector extracted from the trained speaker embedding neural network is added to Tacotron2 as an input to express the desired speaker's tone.
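
One common way to feed a speaker embedding into a sequence-to-sequence synthesizer is to attach it to every encoder output frame. The abstract does not say how the x-vector is combined with Tacotron2, so the sketch below assumes concatenation after L2 normalization; the function names and the toy dimensions are hypothetical and not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def condition_on_speaker(encoder_frames, x_vector):
    """Append the normalized speaker embedding to every encoder frame.

    encoder_frames: list of per-time-step feature vectors (text encoder output).
    x_vector: fixed-length speaker embedding from a speaker encoder.
    The decoder would then attend over frames that carry speaker identity.
    """
    speaker = l2_normalize(x_vector)
    return [list(frame) + speaker for frame in encoder_frames]


# Toy example: 3 encoder frames of size 2, speaker embedding of size 2.
frames = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
conditioned = condition_on_speaker(frames, [3.0, 4.0])
print(len(conditioned), len(conditioned[0]))  # 3 frames, each of size 4
```

Because the embedding is the same at every time step, swapping in a different x-vector changes the target voice without changing the text representation.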

Construction Status and Proposal for Information Communication Facility of Childcare Center -After COVID19, focusing on IT Technology Utilization- (어린이집 정보통신설비 구축현황 및 제안 -COVID19 이후 IT기술활용 중심으로-)

  • Lee, Jae-Yong;Shin, Seung-Jung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.20 no.4 / pp.43-50 / 2020
  • The purpose of this study is to examine cases of building information and communication facilities in daycare centers and to propose an educational environment that can foster young talents who will lead the era of the fourth industrial revolution. In particular, for the period after COVID19, we propose a way to create an information and communication environment in which children can receive personalized education, to support experiential education where possible, and at the same time to enable averaging of customized learning. Since there has been no prior research on information and communication facilities in daycare centers, this study is significant as a starting point. Looking ahead, to foster creative and contextual children, we argue that daycare center design should change toward reducing teachers' movement through smart speakers and mobile devices and tailoring the educational environment through AI data. To this end, the CM role of information and communication supervision is needed, and we hope that further research on daycare centers will lead to a design standard for daycare centers after COVID19.

T-commerce Trends and Development Model Proposal -Focusing on Broadcasting Screens and Customer Data Utilization- (T커머스 동향 및 발전모델 제안 -방송화면 및 고객데이터 활용중심-)

  • Lee, Jae-Yong;Shin, Seung-Jung
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.21 no.2 / pp.49-54 / 2021
  • The purpose of this study is to identify trends in T-commerce and to propose improvements to customer data-based services and development models for changes in broadcasting screens as the number of IPTV subscribers grows. Implementing a customized, mobile-like shopping model on TV and improving customer satisfaction will reduce customer churn and provide a more convenient shopping environment on large screens. We review the current status and problems of T-commerce broadcasting, explain some technically validated models (channel-in-channel, AI speaker), and discuss how to ease legal constraints (the broadcasting law and the Internet multimedia business law).

Exploring user experience factors through generational online review analysis of AI speakers (인공지능 스피커의 세대별 온라인 리뷰 분석을 통한 사용자 경험 요인 탐색)

  • Park, Jeongeun;Yang, Dong-Uk;Kim, Ha-Young
    • Journal of the Korea Convergence Society / v.12 no.7 / pp.193-205 / 2021
  • The AI speaker market is growing steadily. However, the satisfaction of actual users is only 42%. In this paper, we therefore collected reviews of the Amazon Echo Dot 3rd and 4th generation models to analyze what hinders the user experience, tracking changes in topics and sentiment across generations of AI speakers. Using topic modeling, we identified the topics that make up the reviews for each generation and how they changed, and we examined how user sentiment toward those topics changed between generations through deep learning-based sentiment analysis. Topic modeling yielded five topics for each generation. For the 3rd generation, the topic representing general features of the speaker acted as a positive factor for the product, while user convenience features acted as a negative factor. Conversely, for the 4th generation, general features emerged as a negative factor and convenience features as a positive one. Methodologically, this analysis is significant in that its results take into account not only lexical features but also the contextual features of the entire sentence.

User Experience Analysis and Management Based on Text Mining: A Smart Speaker Case (텍스트 마이닝 기반 사용자 경험 분석 및 관리: 스마트 스피커 사례)

  • Dine Yeon;Gayeon Park;Hee-Woong Kim
    • Information Systems Review / v.22 no.2 / pp.77-99 / 2020
  • A smart speaker is a device that uses artificial intelligence to provide an interactive voice-based service for searching and using various information and content such as music, calendars, weather, and merchandise. Since AI technology provides more sophisticated and optimized services as it accumulates data, early smart speaker manufacturers tried to build a platform through aggressive marketing. However, more than one third of users use their smart speaker less than once a month, and user satisfaction is only 49%. Accordingly, the need to strengthen the smart speaker user experience has emerged in order to acquire a large number of users and to enable continuous use. This study therefore analyzes the user experience of smart speakers and proposes methods for enhancing it. Based on the results of a two-stage analysis, we propose ways to enhance the user experience for each smart speaker model. Whereas existing research on the smart speaker user experience has relied mainly on surveys and interviews, this study collected actual review data written by users and interpreted the analysis results along smart speaker user experience dimensions. Its academic significance lies in developing these dimensions and using them to interpret the text mining results. Based on the results of this study, we can suggest strategies for enhancing the user experience to smart speaker manufacturers.