• Title/Summary/Keyword: voice recognition

Search Results: 657

Analysis of Approaches to Learning Based on Student-Student Verbal Interactions according to the Type of Inquiry Experiments Using Everyday Materials (실생활 소재 탐구 실험 형태에 따른 학생-학생 언어적 상호작용에서의 학습 접근 수준 분석)

  • Kim, Hye-Sim;Lee, Eun-Kyeong;Kang, Seong-Joo
    • Journal of The Korean Association For Science Education
    • /
    • v.26 no.1
    • /
    • pp.16-24
    • /
    • 2006
  • The purpose of this study was to compare student-student verbal interaction in two types of experiments: problem-solving and task-solving. Five third-grade middle school students were selected, and their verbal interactions were recorded on audio and video and later transcribed. The student-student verbal interactions were classified into question, explanation, thought, and metacognition fields, which were further separated into deep versus surface learning approaches. For the problem-solving experiment, findings revealed that the number of verbal interactions more than doubled and, in particular, the number of deep-approach verbal interactions more than quadrupled from problem recognition to problem solution. For the task-solving experiment, verbal interactions remained evenly distributed throughout the entire experiment. Finally, students relied on a deeper learning approach during the problem-solving experiment than during the task-solving experiment.

Applying Social Strategies for Breakdown Situations of Conversational Agents: A Case Study using Forewarning and Apology (대화형 에이전트의 오류 상황에서 사회적 전략 적용: 사전 양해와 사과를 이용한 사례 연구)

  • Lee, Yoomi;Park, Sunjeong;Suk, Hyeon-Jeong
    • Science of Emotion and Sensibility
    • /
    • v.21 no.1
    • /
    • pp.59-70
    • /
    • 2018
  • With the breakthrough of speech recognition technology, conversational agents have become pervasive through smartphones and smart speakers. The recognition accuracy of speech recognition technology has reached a human-like level, but it still shows limitations in understanding the underlying meaning or intention of utterances and in following long conversations. Accordingly, users experience various errors when interacting with conversational agents, which can negatively affect the user experience. In addition, for smart speakers that use voice as the main interface, the lack of system feedback and transparency has been reported as a major issue during use. Therefore, there is a strong need for research on how users can better understand the capabilities of conversational agents and how negative emotions in error situations can be mitigated. In this study, we applied two social strategies, "forewarning" and "apology", to a conversational agent and investigated how these strategies affect users' perceptions of the agent in breakdown situations. For the study, we created a series of demo videos of a user interacting with a conversational agent. After watching the demo videos, participants were asked in an online survey to evaluate how much they liked and trusted the agent. Responses from a total of 104 participants were analyzed, and the results were contrary to our expectations based on the literature. Forewarning gave users a negative impression of the agent, especially regarding its perceived reliability. In addition, an apology in a breakdown situation did not affect users' perceptions. In the follow-up in-depth interviews, participants explained that they perceived the smart speaker as a machine rather than a human-like object, and for this reason the social strategies did not work. These results suggest that social strategies should be applied according to the perceptions users hold toward agents.

The Audience Behavior-based Emotion Prediction Model for Personalized Service (고객 맞춤형 서비스를 위한 관객 행동 기반 감정예측모형)

  • Ryoo, Eun Chung;Ahn, Hyunchul;Kim, Jae Kyeong
    • Journal of Intelligence and Information Systems
    • /
    • v.19 no.2
    • /
    • pp.73-85
    • /
    • 2013
  • In today's information society, the importance of knowledge services that turn information into value is increasing day by day. With the development of IT, it has also become easier to collect and use information, and many companies across a variety of industries actively use customer information for marketing. Since the start of the 21st century, companies have actively used culture and the arts to manage their corporate image and for marketing closely linked to their commercial interests. However, it is difficult for companies to attract or maintain consumers' interest through technology alone, so cultural activities have become a common tool of differentiation among firms, and many firms have built new marketing strategies around the customer's experience in order to respond effectively to a competitive market. Accordingly, the need for personalized services that provide a new experience based on a personal profile containing the characteristics of the individual is emerging rapidly. Personalized service using individual profile information such as language, symbols, behavior, and emotions is therefore very important today; through it, the interaction between people and content can be assessed and the customer's experience and satisfaction can be maximized. Various related works provide customer-centered services, and emotion recognition research in particular has emerged recently. Existing studies have performed emotion recognition mostly using bio-signals, and most have focused on voice and facial expressions, which show large emotional changes. However, limitations of equipment and service environments make it difficult to predict people's emotions with these approaches. In this paper, we therefore develop an emotion prediction model based on a vision-based interface to overcome the existing limitations. Emotion recognition based on people's gestures and posture has been studied by several researchers. This paper develops a model that recognizes people's emotional states through body gesture and posture using the difference-image method, and identifies an optimized, validated model for predicting four kinds of emotions. The proposed model aims to automatically determine and predict four human emotions (sadness, surprise, joy, and disgust). To build the model, an event booth was installed in KOCCA's lobby and suitable stimulus movies were shown in order to collect participants' body gestures and postures as their emotions changed. Body movements were then extracted using the difference-image method, and the data were refined to build the proposed model with a neural network. The proposed emotion prediction model used three time-frame sets (20, 30, and 40 frames), and the model with the best performance was adopted. Before building the three models, the entire set of 97 samples was divided into learning, test, and validation sets. The proposed emotion prediction model was constructed as an artificial neural network: the back-propagation algorithm was used as the learning method, with the learning rate set to 10% and the momentum to 10%, and the sigmoid function was used as the transfer function. The network was designed as a three-layer perceptron with one hidden layer and four output nodes. Based on the test data set, learning was stopped at 50,000 iterations after the minimum error was reached in order to explore the stopping point. We finally evaluated each model's accuracy and identified the best model for predicting each emotion. The results showed prediction accuracies of 100% for sadness and 96% for joy with the 20-frame model, and 88% for surprise and 98% for disgust with the 30-frame model. The findings are expected to provide an effective algorithm for personalized services in various industries such as advertising, exhibitions, and performances.
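For readers who want a concrete picture of the network described in this abstract, the following is a minimal sketch, not the authors' implementation: a three-layer perceptron with one hidden layer, four sigmoid output nodes (sadness, surprise, joy, disgust), and back-propagation with the stated 10% (i.e., 0.1) learning rate and momentum. The input dimensionality, hidden size, and batch handling are illustrative assumptions.

```python
# Sketch of the described three-layer perceptron (assumed shapes/sizes).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class EmotionMLP:
    def __init__(self, n_in, n_hidden, n_out=4, lr=0.1, momentum=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(scale=0.1, size=(n_hidden, n_out))
        self.b2 = np.zeros(n_out)
        self.lr, self.momentum = lr, momentum
        # previous updates, kept for the momentum term
        self.vW1 = np.zeros_like(self.W1); self.vb1 = np.zeros_like(self.b1)
        self.vW2 = np.zeros_like(self.W2); self.vb2 = np.zeros_like(self.b2)

    def forward(self, X):
        self.h = sigmoid(X @ self.W1 + self.b1)   # hidden layer
        self.o = sigmoid(self.h @ self.W2 + self.b2)  # 4 emotion outputs
        return self.o

    def train_step(self, X, Y):
        """One back-propagation step on difference-image features X
        and one-hot emotion targets Y of shape (n_samples, 4)."""
        out = self.forward(X)
        # squared-error gradients through the sigmoid layers
        d_out = (out - Y) * out * (1.0 - out)
        d_hid = (d_out @ self.W2.T) * self.h * (1.0 - self.h)
        gW2 = self.h.T @ d_out;  gb2 = d_out.sum(axis=0)
        gW1 = X.T @ d_hid;       gb1 = d_hid.sum(axis=0)
        # gradient descent with momentum
        self.vW2 = self.momentum * self.vW2 - self.lr * gW2
        self.vb2 = self.momentum * self.vb2 - self.lr * gb2
        self.vW1 = self.momentum * self.vW1 - self.lr * gW1
        self.vb1 = self.momentum * self.vb1 - self.lr * gb1
        self.W2 += self.vW2; self.b2 += self.vb2
        self.W1 += self.vW1; self.b1 += self.vb1
        return float(np.mean((out - Y) ** 2))
```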

Research on Generative AI for Korean Multi-Modal Montage App (한국형 멀티모달 몽타주 앱을 위한 생성형 AI 연구)

  • Lim, Jeounghyun;Cha, Kyung-Ae;Koh, Jaepil;Hong, Won-Kee
    • Journal of Service Research and Studies
    • /
    • v.14 no.1
    • /
    • pp.13-26
    • /
    • 2024
  • Multi-modal generation is the process of generating results from several kinds of information, such as text, images, and audio. With the rapid development of AI technology, a growing number of multi-modal systems synthesize different types of data to produce results. In this paper, we present an AI system that uses speech and text recognition of a person's description to generate a montage image. Whereas existing montage generation technology is based on Western facial appearance, the montage generation system developed in this paper trains a model on Korean facial features, so it can create more accurate and effective Korean montage images from multi-modal Korean voice and text input. Since the output of the developed app can be used as a draft montage, it can dramatically reduce the manual labor of montage production personnel. For this purpose, we utilized persona-based virtual person montage data provided by the AI-Hub of the National Information Society Agency. AI-Hub is an AI integration platform aimed at providing a one-stop service by building the artificial intelligence training data necessary for the development of AI technology and services. The image generation system was implemented using VQGAN, a deep learning model used to generate high-resolution images, and KoDALLE, a Korean-based image generation model. We confirmed that the trained AI model creates a montage of a face very similar to the one described by voice and text. To verify the practicality of the developed montage generation app, 10 testers used it and more than 70% responded that they were satisfied. The montage generator can be used in various fields, such as criminal investigation, to describe and visualize facial features.
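The pipeline described above (speech/text description, text-conditioned image-token generation, VQGAN decoding) can be summarized in a structural sketch. This is not the authors' code and it does not use the real KoDALLE or VQGAN APIs; every function below is a hypothetical placeholder marking where the corresponding component would plug in.

```python
# Structural sketch only; the three components are stubs, not real model calls.
from typing import List

def transcribe(audio_path: str) -> str:
    """Placeholder for a Korean speech-recognition model."""
    raise NotImplementedError("plug in an ASR model here")

def generate_image_tokens(description: str) -> List[int]:
    """Placeholder for KoDALLE-style autoregressive image-token sampling."""
    raise NotImplementedError("plug in the text-to-image-token model here")

def decode_tokens(image_tokens: List[int]):
    """Placeholder for the VQGAN decoder that maps discrete codes to pixels."""
    raise NotImplementedError("plug in the VQGAN decoder here")

def build_montage(audio_path: str, extra_text: str = ""):
    # 1. speech -> Korean facial description (typed text may be added as well)
    description = f"{transcribe(audio_path)} {extra_text}".strip()
    # 2. description -> discrete image tokens conditioned on the text
    image_tokens = generate_image_tokens(description)
    # 3. discrete tokens -> montage image via the VQGAN decoder
    return decode_tokens(image_tokens)
```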

An User Experience Analysis of Virtual Assistant Using Grounded Theory - Focused on SKT Virtual Personal Assistant 'NUGU' - (근거 이론을 적용한 가상 비서의 사용자 경험 분석 - SKT 가상 비서 'NUGU'를 중심으로 -)

  • Hwang, Seung Hee;Yun, Ray Jaeyoung
    • Journal of the HCI Society of Korea
    • /
    • v.12 no.2
    • /
    • pp.31-40
    • /
    • 2017
  • This is a qualitative study of SKT's voice-recognition virtual personal assistant 'NUGU', which was launched on September 1, 2016. In-depth interviews were conducted with 9 participants who had used the device for more than a month. From the interviews, 362 concepts were identified and, through open, axial, and selective coding, categorized into 16 sub-categories and 10 top categories. A paradigm model was then proposed from the open coding, and through selective coding the core category of the study was narrowed down to 'usage patterns by type'. The typification confirmed that usage patterns can be described as two types: dependent and inquiry. The results provide basic data about the user experience of virtual assistants that can be utilized when designing virtual personal assistants in the near future.

Good Government, I want to Live in there : Using the Q-methodology (좋은 국가, 그곳에서 살고 싶다! : Q방법론을 활용하여)

  • Lee, Doh-Hee;Yu, Young-Seol
    • The Journal of the Korea Contents Association
    • /
    • v.17 no.12
    • /
    • pp.545-557
    • /
    • 2017
  • As Korea's new government took office, calls for a so-called "proper country" grew louder, along with discussion of what a "good country" and "good government" mean. Therefore, based on everyday statements and prior research on the 'good country' and 'good government', we collected associative statements about the 'good country'. As a result of the analysis, we classified perceptions of the 'good country' into the following five types: Type 1 is named the "Trusted State Type", Type 2 the "Workable State Type", Type 3 the "National Type for Children", Type 4 the "Happy National Type", and Type 5 the "Living Type". According to the results of the analysis, perceptions of a good country did not differ significantly by age or occupation. In the twenty-first century, the Republic of Korea is compelled to rethink its understanding of the 'state' and the reason for its existence, as a new government takes office in a turbulent period of regime change. This study is intended to reflect on the meaning of the existence of 'government' and to provide an opportunity to recall the desires and expectations associated with 'good government'.

A Study on Improving of Access to School Library Collection through High School Students' DLS Search Behavior Analysis (고등학생의 DLS 검색행태 분석을 통한 학교도서관 자료 접근성 향상 방안 고찰)

  • Jung, Youngmi;Kang, Bong-Suk
    • Journal of Korean Library and Information Science Society
    • /
    • v.51 no.2
    • /
    • pp.355-379
    • /
    • 2020
  • The Digital Library System (DLS) for school libraries is a key access tool for school library materials. The purpose of this study was to find ways to improve the accessibility of materials through analysis of students' information search behavior in the DLS. Data were collected by recording 42 participants' DLS search processes and through a questionnaire. As a result, the search success rate and search satisfaction were lower when the main purpose of using the DLS was simple leisure reading, when information needs were relatively ambiguous, and when users experienced complicated situations in the search process. Satisfaction with the sufficiency of search time was the highest, and satisfaction with search results was the lowest. In addition, needs for DLS improvements were identified, such as integrated search of other libraries' collection information, recommendation of related materials, printed output of shelf locations, voice recognition through mobile apps, and automatic correction of search errors. Based on these findings, the following can be suggested. First, the DLS should supplement its career-information function to reflect the demands of education consumers. Second, DLS functionality should be improved to the level of general information retrieval systems. Third, an infrastructure for close cooperation between school library field personnel and the DLS management authorities must be established.
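One of the improvements requested above, automatic correction of search errors, can be illustrated with a minimal sketch. This is not part of the actual DLS; it simply matches possibly misspelled query terms against a catalog vocabulary using Python's standard-library difflib, and the sample vocabulary is made up for illustration.

```python
# Minimal spell-suggestion sketch for catalog search terms (illustrative only).
import difflib

def suggest_corrections(query: str, vocabulary: list[str], n: int = 3) -> list[str]:
    """Return up to n close matches for each query term not found in the vocabulary."""
    known = {term.lower() for term in vocabulary}
    suggestions = []
    for term in query.split():
        if term.lower() not in known:
            suggestions.extend(
                difflib.get_close_matches(term.lower(), list(known), n=n, cutoff=0.7)
            )
    return suggestions

# Toy example with a made-up vocabulary
catalog_terms = ["library", "information", "science", "career", "reading"]
print(suggest_corrections("carrer reaing", catalog_terms))  # e.g. ['career', 'reading']
```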

RPCA-GMM for Speaker Identification (화자식별을 위한 강인한 주성분 분석 가우시안 혼합 모델)

  • 이윤정;서창우;강상기;이기용
    • The Journal of the Acoustical Society of Korea
    • /
    • v.22 no.7
    • /
    • pp.519-527
    • /
    • 2003
  • Speech is strongly influenced by outliers introduced by unexpected factors such as additive background noise, changes in the speaker's utterance pattern, and voice detection errors. Such outliers may result in severe degradation of speaker recognition performance. In this paper, we propose a Gaussian mixture model based on robust principal component analysis (RPCA-GMM) using M-estimation to address both outliers and the high dimensionality of training feature vectors in speaker identification. First, a new feature vector with reduced dimension is obtained by robust PCA based on M-estimation: the robust PCA projects the original feature vector onto the lower-dimensional linear subspace spanned by the leading eigenvectors of the feature covariance matrix. Second, a GMM with diagonal covariance matrices is estimated from the transformed feature vectors. We performed speaker identification experiments to show the effectiveness of the proposed method, comparing RPCA-GMM with PCA-based features and with a conventional diagonal-covariance GMM. For every 2% increase in the proportion of outliers, the proposed method maintained almost the same speaker identification rate, with a slight degradation of 0.03%, while the conventional GMM and the PCA-based method degraded by 0.65% and 0.55%, respectively. This means that our method is more robust to the presence of outliers.
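The two-stage pipeline described above (dimension reduction followed by a diagonal-covariance GMM per speaker) can be sketched as follows. This is a simplified illustration, not the authors' code: plain scikit-learn PCA stands in for their M-estimation-based robust PCA, and per-frame acoustic features (e.g., MFCCs) are assumed to be available already.

```python
# Simplified RPCA-GMM-style speaker identification sketch (plain PCA used here).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture

def train_speaker_models(features_by_speaker, n_components_pca=20, n_mixtures=16):
    """features_by_speaker: speaker id -> (n_frames, n_dims) array of feature vectors."""
    all_feats = np.vstack(list(features_by_speaker.values()))
    # Dimension reduction; the paper replaces this with M-estimation-based robust PCA.
    pca = PCA(n_components=n_components_pca).fit(all_feats)
    models = {}
    for spk, feats in features_by_speaker.items():
        gmm = GaussianMixture(n_components=n_mixtures, covariance_type="diag")
        gmm.fit(pca.transform(feats))          # diagonal-covariance GMM per speaker
        models[spk] = gmm
    return pca, models

def identify(pca, models, test_feats):
    """Return the speaker whose GMM gives the highest average log-likelihood."""
    reduced = pca.transform(test_feats)
    return max(models, key=lambda spk: models[spk].score(reduced))
```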

Ubiquitous u-Health System using RFID & ZigBee (RFID와 ZigBee를 이용한 유비쿼터스 u-Health 시스템 구현)

  • Kim Jin-Tai;Kwon Youngmi
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.43 no.1 s.343
    • /
    • pp.79-88
    • /
    • 2006
  • In this paper, we designed and implemented a ubiquitous u-Health system using RFID and ZigBee. We built a wireless protocol kit that combines RFID tag recognition with ZigBee data communication capability; the software was designed and developed on TinyOS. Wireless communication technologies that can form multi-protocol stacks with RFID and enable the wireless ubiquitous world include Bluetooth, ZigBee, and 802.11x WLAN. One environment in which the suggested u-Health system may be used is unmanned nursing in dense sensor networks such as a hospital. Devices with RFID and ZigBee will become as small as a bracelet, a wrist watch, or a ring. The combined wireless RFID-ZigBee system can be applied to applications that require actions corresponding to the collected (or sensed) information in a WBAN (Wireless Body Area Network) and/or WPAN (Wireless Personal Area Network). The proposed ubiquitous u-Health system displays a text alert message on the attached LCD or gives a voice alert message to the appropriate node users. RFID can be combined with other wireless technologies in various ways for application-specific purposes.
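The alert behavior described above (an action triggered by sensed information, delivered as a text or voice message to the right node) can be sketched in a few lines. This is not the authors' TinyOS/nesC firmware; the patient table, thresholds, and notify() helper are illustrative assumptions, and the real system would push the message over ZigBee to an LCD or voice node.

```python
# Illustrative alert-dispatch logic for an RFID+ZigBee u-Health node (assumed values).
from dataclasses import dataclass

@dataclass
class Reading:
    rfid_tag: str      # patient identified by the RFID tag on a bracelet or ring
    heart_rate: int    # value reported by a body-area (WBAN) sensor node

PATIENTS = {"TAG-0001": "Ward 3, Bed 2"}   # made-up lookup table
HR_LOW, HR_HIGH = 50, 120                  # illustrative thresholds

def notify(message: str, mode: str) -> None:
    # Placeholder for the ZigBee transmission to an LCD (text) or speaker (voice) node.
    print(f"[{mode}] {message}")

def handle_reading(r: Reading) -> None:
    location = PATIENTS.get(r.rfid_tag, "unknown location")
    if r.heart_rate < HR_LOW or r.heart_rate > HR_HIGH:
        notify(f"Abnormal heart rate {r.heart_rate} at {location}", mode="voice")
    else:
        notify(f"{r.rfid_tag} OK ({r.heart_rate} bpm)", mode="text")

handle_reading(Reading("TAG-0001", 135))  # -> [voice] Abnormal heart rate 135 at Ward 3, Bed 2
```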

Development of a Portable Card Reader for the Visually Impaired using Raspberry Pi (라즈베리 파이를 적용한 시각장애인을 위한 휴대용 카드 리더기 개발)

  • Lee, Hyun-Seung;Choi, In-Moon;Lim, Soon-Ja
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.18 no.10
    • /
    • pp.131-135
    • /
    • 2017
  • We developed a portable card reader for the visually impaired. In South Korea, there is insufficient development of lifestyle aids for people with disabilities. Living aids for people with disabilities are being developed using information technology, smart phones, Internet of Things(IoT) devices, 3D printers, and so on. Blind people were interviewed, which showed that the card recognition function using a currently developed smart phone app was not able to recognize the screen of the smart phone by the hand of the visually impaired, and it was inconvenient to operate. In recent years, devices that enable the visually impaired to recognize cards have been studied in foreign countries and are emerging prototypes. But what is currently available is expensive and inconvenient. In addition, visually impaired people are most vulnerable to low-income families, which makes it difficult to purchase and use expensive devices. In this study, we developed a card reader that recognizes a card using a Raspberry Pi, which is an open-source hardware that can be applied to IoT. The card reader plays it by voice and vibration, and the visually impaired can use it at a low price.