Title/Summary/Keyword: Multi-Modal Recognition


NUI/NUX framework based on intuitive hand motion (직관적인 핸드 모션에 기반한 NUI/NUX 프레임워크)

  • Lee, Gwanghyung; Shin, Dongkyoo; Shin, Dongil
    • Journal of Internet Computing and Services, v.15 no.3, pp.11-19, 2014
  • The natural user interface/experience (NUI/NUX) enables natural motion interfaces without devices or tools such as mice, keyboards, pens, and markers. Until now, typical motion recognition methods have used markers, receiving the coordinates of each marker as relative data and storing each coordinate value in a database. However, recognizing motion accurately requires more markers, and attaching the markers and processing the data takes considerable time. Moreover, NUI/NUX frameworks have often been developed without the most important quality, intuitiveness, so usability problems arise and users are forced to learn the conventions of many different frameworks. To address this problem, we avoid markers entirely and implement the system so that anyone can operate it. We also design a multi-modal NUI/NUX framework that controls voice, body motion, and facial expression simultaneously, and propose a new mouse-operation algorithm that recognizes intuitive hand gestures and maps them onto the monitor, so that users can operate this "hand mouse" easily and intuitively.
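
As a rough illustration of the hand-mouse idea, the sketch below maps a normalized hand position from a camera frame onto monitor coordinates, smooths it against jitter, and treats a thumb-index pinch as a click. The detector is stubbed, and every name and constant is an assumption for illustration, not the authors' algorithm.

```python
# Hypothetical sketch of a marker-free "hand mouse": map a normalized
# hand position from a camera frame onto monitor coordinates.
# The landmark detector is stubbed; names and constants are illustrative.

SCREEN_W, SCREEN_H = 1920, 1080
ALPHA = 0.3  # smoothing factor: higher = more responsive, lower = steadier


def to_screen(norm_x: float, norm_y: float) -> tuple[int, int]:
    """Map normalized camera coordinates (0..1) to monitor pixels."""
    return int(norm_x * SCREEN_W), int(norm_y * SCREEN_H)


def smooth(prev: tuple[int, int], cur: tuple[int, int]) -> tuple[int, int]:
    """Exponentially smooth the cursor to suppress hand jitter."""
    return (
        int(ALPHA * cur[0] + (1 - ALPHA) * prev[0]),
        int(ALPHA * cur[1] + (1 - ALPHA) * prev[1]),
    )


def is_pinch(thumb: tuple[float, float], index: tuple[float, float]) -> bool:
    """Treat a small thumb-index distance as a mouse click gesture."""
    return (thumb[0] - index[0]) ** 2 + (thumb[1] - index[1]) ** 2 < 0.002


cursor = (SCREEN_W // 2, SCREEN_H // 2)
# Fake frames standing in for per-frame hand detector output.
frames = [
    {"hand": (0.50, 0.50), "thumb": (0.48, 0.50), "index": (0.55, 0.45)},
    {"hand": (0.52, 0.48), "thumb": (0.52, 0.48), "index": (0.53, 0.47)},
]
for f in frames:
    cursor = smooth(cursor, to_screen(*f["hand"]))
    print(cursor, "click" if is_pinch(f["thumb"], f["index"]) else "move")
```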

Smart Affect Jewelry based on Multi-modal (멀티 모달 기반의 스마트 감성 주얼리)

  • Kang, Yun-Jeong
    • Journal of the Korea Institute of Information and Communication Engineering, v.20 no.7, pp.1317-1324, 2016
  • This paper presents smart jewelry, built on the Arduino platform, that expresses emotions through color. The emotion-to-color mapping applies Plutchik's Wheel of Emotions, which relates similar emotions to similar colors. The jewelry receives readings from its temperature, light, sound, pulse, and gyro sensors, easily accessible from the wearer's smartphone, and processes them with ontology-based inference rules to recognize the wearer's emotion in context. The emotion and color combination extracted from the sensed context is then displayed on the jewelry's built-in smart LED, reflecting the wearer's emotional state. By adding light to emotion, the smart jewelry can represent the emotions of a situation and serve as a tool for expressing the wearer's intentions.
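
To make the emotion-to-color pipeline concrete, here is a minimal Python sketch of rule-based inference from sensor readings to a Plutchik-style color, standing in for the paper's ontology-based rules. All thresholds, rules, and RGB values are illustrative assumptions; the real system runs on an Arduino device.

```python
# Illustrative sketch of rule-based emotion inference from wearable
# sensor readings, mapped to colors in the spirit of Plutchik's wheel.
# Thresholds, rules, and RGB values are assumptions for demonstration.

# Plutchik primary emotions with commonly associated display colors.
EMOTION_COLOR = {
    "joy": (255, 220, 0),       # yellow
    "trust": (0, 200, 0),       # green
    "fear": (0, 100, 0),        # dark green
    "anger": (255, 0, 0),       # red
    "sadness": (0, 0, 255),     # blue
    "surprise": (0, 180, 255),  # light blue
}


def infer_emotion(pulse_bpm: float, sound_db: float, lux: float) -> str:
    """Toy rules combining pulse, ambient sound, and light into an emotion."""
    if pulse_bpm > 110 and sound_db > 80:
        return "fear"
    if pulse_bpm > 100:
        return "surprise"
    if lux > 500 and sound_db < 50:
        return "joy"
    if lux < 50:
        return "sadness"
    return "trust"


def led_color(pulse_bpm: float, sound_db: float, lux: float):
    """Return the inferred emotion and the LED color to display."""
    emotion = infer_emotion(pulse_bpm, sound_db, lux)
    return emotion, EMOTION_COLOR[emotion]


print(led_color(pulse_bpm=72, sound_db=40, lux=700))   # ('joy', ...)
print(led_color(pulse_bpm=120, sound_db=90, lux=300))  # ('fear', ...)
```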

The Effect of AI Agent's Multi Modal Interaction on the Driver Experience in the Semi-autonomous Driving Context : With a Focus on the Existence of Visual Character (반자율주행 맥락에서 AI 에이전트의 멀티모달 인터랙션이 운전자 경험에 미치는 효과 : 시각적 캐릭터 유무를 중심으로)

  • Suh, Min-soo; Hong, Seung-Hye; Lee, Jeong-Myeong
    • The Journal of the Korea Contents Association, v.18 no.8, pp.92-101, 2018
  • As interactive AI speakers become popular, voice recognition is regarded as an important vehicle-driver interaction method in autonomous driving situations. The purpose of this study is to examine whether multimodal interaction, in which feedback is delivered both auditorily and through a visual AI character on screen, optimizes user experience better than an auditory-only mode. Participants performed music selection and adjustment tasks through an AI speaker while driving, and we measured information and system quality, presence, perceived usefulness and ease of use, and continuance intention. The analysis showed no multimodal effect of the visual character on most user experience factors, nor on continuance intention. Rather, the auditory-only mode proved more effective than the multimodal mode on the information quality factor. In the semi-autonomous driving stage, which demands the driver's cognitive effort, multimodal interaction is thus not effective for optimizing user experience compared with single-mode interaction.

Damage Detection of Bridge Structures Considering Uncertainty in Analysis Model (해석모델의 불확실성을 고려한 교량의 손상추정기법)

  • Lee, Jong-Jae; Yun, Chung-Bang
    • Journal of the Computational Structural Engineering Institute of Korea, v.19 no.2 s.72, pp.125-138, 2006
  • The use of system identification approaches for damage detection has expanded in recent years owing to advances in data acquisition systems and information processing techniques. Soft computing techniques such as neural networks and genetic algorithms have been used increasingly for this purpose because of their excellent pattern recognition capability. This study presents damage detection of bridge structures using a neural network technique based on modal properties, which can effectively account for modeling uncertainty in the analysis model from which the training patterns are generated. The differences or ratios of the mode shape components before and after damage are used as the network input, since they are found to be less sensitive to modeling errors than the mode shapes themselves. Two numerical example analyses, on a simple beam and on a multi-girder bridge, demonstrate the effectiveness and applicability of the proposed method.
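
The sketch below illustrates the input features the abstract describes: element-wise differences and ratios of mode shape components before and after damage, concatenated into a neural-network input vector. The mode shapes are synthetic stand-ins, not the paper's beam or bridge models.

```python
import numpy as np

# Sketch of the network input features described in the abstract:
# differences and ratios of mode shape components before and after
# damage, which are less sensitive to modeling error than raw shapes.


def mode_shape_features(phi_before: np.ndarray, phi_after: np.ndarray):
    """Return (differences, ratios) of mode shape components.

    phi_before, phi_after: (n_modes, n_points) arrays of mode shape
    components measured before and after (possible) damage.
    """
    diff = phi_after - phi_before
    ratio = phi_after / phi_before
    return diff, ratio


# Synthetic example: 2 modes x 5 measurement points on a simple beam,
# chosen away from mode shape nodes to keep the ratios well defined.
x = np.array([0.10, 0.25, 0.40, 0.60, 0.75])
phi_before = np.vstack([np.sin(np.pi * x), np.sin(2 * np.pi * x)])
# Simulated local damage near midspan slightly reduces the amplitudes.
phi_after = phi_before * (1 - 0.05 * np.exp(-((x - 0.5) ** 2) / 0.02))

diff, ratio = mode_shape_features(phi_before, phi_after)
features = np.concatenate([diff.ravel(), ratio.ravel()])  # NN input vector
print(features.round(4))
```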

Contents Development of IrobiQ on School Violence Prevention Program for Young Children (지능형 로봇 아이로비큐(IrobiQ)를 활용한 학교폭력 예방 프로그램 개발)

  • Hyun, Eunja; Lee, Hawon; Yeon, Hyemin
    • The Journal of the Korea Contents Association, v.13 no.9, pp.455-466, 2013
  • The purpose of this study was to develop "Modujikimi," a school violence prevention program for young children, to be embedded in IrobiQ, a teacher-assistive robot. The program's themes consist of basic character education, bullying prevention education, and sexual violence prevention education. Activity types include large group, individual, and small group activities, free choice activities, and parents' education, making use of poems, fairy tales, music, art, and story sharing. The program employs the robot's multimodal functions: on-screen images, TTS (Text To Speech), touch input, sound recognition, and a recording system. The robot content was demonstrated to thirty early childhood educators, whose acceptance of the content was measured with questionnaires, and the content was also applied to children in a daycare center. The majority responded positively on acceptability. The results suggest that further research is needed to improve the two-way interactivity of teacher-assistive robots.
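
To illustrate how such multimodal content might be scripted, here is a hypothetical sketch of one teaching step that combines a screen image, TTS, and touch input. This is not the IrobiQ SDK; every name here is an invented stand-in for whatever the real platform provides.

```python
# Hypothetical sketch of one step of robot-delivered teaching content
# combining screen image, TTS, and touch. NOT the IrobiQ SDK: all
# names are invented stand-ins for illustration only.

from dataclasses import dataclass


@dataclass
class Step:
    image: str         # picture shown on the robot's screen
    tts: str           # sentence spoken via text-to-speech
    expect_touch: str  # screen region the child should touch


def run_step(step: Step, touched_region: str) -> str:
    """Play one teaching step and react to the child's touch input."""
    print(f"[screen] show {step.image}")
    print(f"[tts]    {step.tts}")
    if touched_region == step.expect_touch:
        return "Well done! That is a kind way to treat a friend."
    return "Let's think again. How would you feel in their place?"


step = Step(
    image="sharing_toys.png",
    tts="What should Minsu do when his friend wants to play too?",
    expect_touch="share_button",
)
print("[tts]   ", run_step(step, touched_region="share_button"))
```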

Social Network Analysis of TV Drama via Location Knowledge-learned Deep Hypernetworks (장소 정보를 학습한 딥하이퍼넷 기반 TV드라마 소셜 네트워크 분석)

  • Nan, Chang-Jun; Kim, Kyung-Min; Zhang, Byoung-Tak
    • KIISE Transactions on Computing Practices, v.22 no.11, pp.619-624, 2016
  • Socially-aware video displays not only the relationships between characters but also diverse information on topics such as economics, politics, and culture as a story unfolds. In particular, people's speaking habits and behavioral patterns in different situations are very important for analyzing social relationships. However, this dynamic multi-modal data is difficult for a computer to analyze effectively. To solve this problem, previous studies employed the deep concept hierarchy (DCH) model to automatically construct and analyze the social networks in a TV drama. Nevertheless, because location knowledge was not included, they could analyze the social network only as a whole across the story. In this research, we include location knowledge and analyze social relations in different places. Our dataset comprises approximately 4,400 minutes of the TV drama Friends. We recognize the characters' faces using a convolutional-recursive neural network model and classify scenes using a bag-of-features model. Then, for the different scenes, we construct the social network between characters with the deep concept hierarchy model and analyze how the social network changes as the story unfolds.
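
The final aggregation step can be sketched as follows: given per-scene character sets (which the paper obtains via face recognition) and scene locations (via scene classification), build a separate co-occurrence network per location. The scene data below is invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Sketch of location-aware social network construction: count how
# often each pair of characters shares a scene, separately for each
# location. Scene detections here are invented stand-ins for the
# outputs of face recognition and scene classification.

scenes = [
    {"location": "coffee_shop", "characters": {"Ross", "Rachel", "Joey"}},
    {"location": "apartment", "characters": {"Monica", "Chandler"}},
    {"location": "coffee_shop", "characters": {"Ross", "Rachel"}},
    {"location": "apartment", "characters": {"Monica", "Chandler", "Joey"}},
]

# networks[location][(a, b)] = number of scenes a and b share there
networks: dict[str, Counter] = {}
for scene in scenes:
    net = networks.setdefault(scene["location"], Counter())
    for pair in combinations(sorted(scene["characters"]), 2):
        net[pair] += 1

for location, net in networks.items():
    print(location)
    for (a, b), weight in net.most_common():
        print(f"  {a} -- {b}: {weight}")
```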

A Conversational Interactive Tactile Map for the Visually Impaired (시각장애인의 길 탐색을 위한 대화형 인터랙티브 촉각 지도 개발)

  • Lee, Yerin; Lee, Dongmyeong; Quero, Luis Cavazos; Bartolome, Jorge Iranzo; Cho, Jundong; Lee, Sangwon
    • Science of Emotion and Sensibility, v.23 no.1, pp.29-40, 2020
  • Visually impaired people use tactile maps to get spatial information about their surrounding environment, find their way, and improve their independent mobility. However, classical tactile maps that rely on braille to describe locations within the map have several limitations, such as a lack of information due to space constraints and limited feedback possibilities. This study describes the development of a new multi-modal interactive tactile map interface that addresses these challenges to improve the usability and independence of visually impaired users. The interface adds touch gesture recognition to the surface of the tactile map and enables users to interact verbally with a voice agent to receive feedback and information about navigation routes and points of interest. A low-cost prototype was developed for usability tests, which evaluated the interface through a survey and interviews given to blind participants after using the prototype. The test results show that the interactive tactile map prototype provides improved usability over traditional braille-only tactile maps. Participants reported that it was easier to find the starting point and the points of interest they wished to navigate to, and that it improved their self-reported independence and confidence compared with traditional tactile maps. Future work includes further development of the mobility solution based on the feedback received and an extensive quantitative study.
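
A minimal sketch of the interaction model described above: a touch gesture selects a point of interest on the map surface, and a spoken follow-up question retrieves route feedback. The POI grid, gesture handling, and responses are illustrative assumptions, not the prototype's implementation.

```python
# Illustrative sketch of a conversational tactile map: touch selects a
# point of interest (POI); a voice query about it returns route info.
# The POI data, gesture names, and answers are invented assumptions.

POIS = {
    (2, 3): {"name": "main entrance",
             "route": "go straight 20 meters, door on the left"},
    (5, 1): {"name": "elevator",
             "route": "turn right, 10 meters ahead"},
}


def on_touch(cell: tuple[int, int]) -> str:
    """Single tap: announce the point of interest under the finger."""
    poi = POIS.get(cell)
    return f"This is the {poi['name']}." if poi else "No landmark here."


def on_voice(cell: tuple[int, int], utterance: str) -> str:
    """Answer a spoken follow-up question about the touched POI."""
    poi = POIS.get(cell)
    if poi and "how do i get" in utterance.lower():
        return f"To reach the {poi['name']}: {poi['route']}."
    return "Sorry, I did not understand. You can ask how to get there."


print(on_touch((2, 3)))
print(on_voice((2, 3), "How do I get there?"))
```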

Anomaly Detection Methodology Based on Multimodal Deep Learning (멀티모달 딥 러닝 기반 이상 상황 탐지 방법론)

  • Lee, DongHoon; Kim, Namgyu
    • Journal of Intelligence and Information Systems, v.28 no.2, pp.101-125, 2022
  • Recently, with the development of computing technology and improvements in the cloud environment, deep learning technology has advanced, and attempts to apply deep learning to various fields are increasing. A typical example is anomaly detection, a technique for identifying values or patterns that deviate from normal data. Among the representative types of anomaly, contextual anomalies, which require an understanding of the overall situation, are especially difficult to detect. Anomaly detection in image data is generally performed using a model pre-trained on large datasets. However, because such pre-trained models are built for object classification, they are of limited use for anomaly detection that must understand the complex situations created by multiple objects. Therefore, in this study we propose a new two-step pre-trained model for detecting abnormal situations. Our methodology performs additional training on image captioning so that the model understands not only individual objects but also the complicated situations they create. Specifically, the proposed methodology transfers the knowledge of a pre-trained model that learned object classification on ImageNet data to an image captioning model, which uses captions describing the situation each image represents. The weights obtained by learning situational characteristics through images and captions are then extracted and fine-tuned to produce an anomaly detection model. To evaluate the proposed methodology, we performed an anomaly detection experiment on 400 situational images; the results show that it outperforms the conventional pre-trained model in both anomaly detection accuracy and F1-score.
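
Under stated assumptions, the two-step transfer idea might look like the PyTorch sketch below: an ImageNet-pretrained backbone is reused inside a captioning model (so the encoder learns situations, not just lone objects) and then fine-tuned with a small binary anomaly head. The sizes and heads are simplifications, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
from torchvision import models

# Sketch of the two-step transfer idea: (1) start from an ImageNet
# classifier, (2) reuse its backbone in a captioning model so the
# encoder absorbs situational context, (3) fine-tune the adapted
# encoder with a small binary head for anomaly detection.

# Step 1: ImageNet-pretrained backbone (object-level knowledge).
backbone = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
backbone.fc = nn.Identity()  # expose 512-d image features
backbone.eval()

# Step 2: stand-in captioning decoder. In the methodology this would be
# trained on image-caption pairs, updating the backbone along the way.
caption_decoder = nn.LSTM(input_size=512, hidden_size=512, batch_first=True)

# Step 3: anomaly head, fine-tuned on normal/abnormal situational images.
anomaly_head = nn.Sequential(nn.Linear(512, 64), nn.ReLU(), nn.Linear(64, 1))


def anomaly_score(images: torch.Tensor) -> torch.Tensor:
    """Return P(abnormal) for a batch of 3x224x224 images."""
    with torch.no_grad():
        feats = backbone(images)  # (B, 512) situation-aware features
        return torch.sigmoid(anomaly_head(feats)).squeeze(1)


batch = torch.randn(2, 3, 224, 224)  # random stand-in images
print(anomaly_score(batch))
```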