• Title/Summary/Keyword: Multimodal interaction

Design and Implementation of a User Activity Auto-recognition System Based on Multimodal Sensors in a Ubiquitous Computing Environment (an automatic ADL (activities of daily living) index measurement system using multi-sensors for health care)

  • Byun, Sung-Ho; Jung, Yu-Suk; Kim, Tae-Su; Kim, Hyun-Woo; Lee, Seung-Hwan; Cho, We-Duke
    • Proceedings of the Korean HCI Society Conference / 2009.02a / pp.21-26 / 2009
  • A sensor system capable of automatically recognizing activities would enable many potential ubiquitous computing applications. This paper presents a new system for recognizing activities of daily living (ADL) such as walking, running, standing, sitting, and lying. The system is based on state-dependent motion analysis using tri-axial accelerometers and a Zigbee tag. Two accelerometers are used for the classification of body and hand activities. Classification of environmental and instrumental activities is performed based on the hand's interaction with an object, identified by its object ID.
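
The abstract above describes windowed accelerometer analysis for ADL classification. The following is a minimal illustrative sketch of that general technique, not the authors' implementation; the sampling rate, window length, features, and classifier choice are all assumptions.

```python
# Illustrative sketch only: windowed feature extraction from a body-worn
# tri-axial accelerometer stream, in the spirit of the ADL recognition
# described above. Sampling rate, window size, and labels are assumptions.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

FS = 50            # assumed sampling rate (Hz)
WINDOW = 2 * FS    # 2-second analysis window

def window_features(acc: np.ndarray) -> np.ndarray:
    """acc: (WINDOW, 3) array of x/y/z acceleration for one window."""
    mag = np.linalg.norm(acc, axis=1)              # signal magnitude
    return np.concatenate([
        acc.mean(axis=0),                          # posture (gravity direction)
        acc.std(axis=0),                           # motion intensity per axis
        [mag.std(), np.abs(np.diff(mag)).mean()],  # overall dynamics
    ])

# Synthetic stand-in data: 200 windows labeled with the ADL classes
# mentioned in the abstract.
rng = np.random.default_rng(0)
X = np.stack([window_features(rng.normal(size=(WINDOW, 3))) for _ in range(200)])
y = rng.choice(["walking", "running", "standing", "sitting", "lying"], size=200)

clf = DecisionTreeClassifier(max_depth=5).fit(X, y)
print(clf.predict(X[:3]))
```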

A Study of Effective Content Construction for AR-based English Learning

  • Kim, Young-Seop; Jeon, Soo-Jin; Lim, Sang-Min
    • Journal of The Institute of Information and Telecommunication Facilities Engineering / v.10 no.4 / pp.143-147 / 2011
  • Systems using augmented reality can save time and cost, and the technology has been validated in various fields because it resolves the sense of unreality found in purely virtual spaces; augmented reality therefore has broad potential for application. Multimodal feedback across the visual, auditory, and tactile senses is a well-known method for enhancing immersion when interacting with virtual objects. By adopting a tangible object, touch sensation can be provided to users. A 3D model of the same scale overlays the whole area of the tangible object, so the marker area is invisible, which contributes to more immersive and natural imagery. Multimodal feedback further improves immersion; in this paper, sound feedback is considered. Building on these elements, initial-stage augmented reality learning content for children is presented. Augmented reality occupies an intermediate stage between the virtual and real worlds, and its adaptability is estimated to be greater than that of virtual reality.

Interface Modeling for Digital Device Control According to Disability Type in Web

  • Park, Joo Hyun; Lee, Jongwoo; Lim, Soon-Bum
    • Journal of Multimedia Information System / v.7 no.4 / pp.249-256 / 2020
  • Learning methods using various assistive and smart devices have been developed to enable independent learning by people with disabilities. Pointer control is the most important consideration for disabled users when controlling a device and the contents of an existing graphical user interface (GUI) environment; however, difficulties in using a pointer can arise depending on the disability type. Although there are individual differences across blindness, low vision, and upper-limb disability, problems with the accuracy of object selection and execution arise in common. A multimodal interface pilot solution is presented that enables people with various disability types to control web interactions more easily. First, we classify web interaction types on digital devices and derive the essential web interactions among them. Second, to solve the problems that occur when performing web interactions for each disability type, we present the necessary technology according to the characteristics of each type. Finally, a pilot solution for a multimodal interface for each disability type is proposed. We identified three disability types and developed a solution for each: a remote-control voice interface for blind people, a voice output interface applying a selective-focusing technique for people with low vision, and a gaze-tracking and voice-command interface for GUI operations for people with upper-limb disability.
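
The per-disability-type routing the abstract describes can be pictured as a small configuration structure. The sketch below is purely illustrative; the profile names and modality labels are invented for this example and are not taken from the paper.

```python
# Purely illustrative sketch of per-disability-type modality routing of
# the kind the paper describes; names and modality sets are assumptions,
# not the authors' implementation.
from dataclasses import dataclass

@dataclass(frozen=True)
class InterfaceProfile:
    inputs: tuple[str, ...]   # how the user issues web interactions
    outputs: tuple[str, ...]  # how results are presented back

PROFILES = {
    # Blind users: remote-control style voice operation (per the abstract).
    "blind": InterfaceProfile(inputs=("voice_command",), outputs=("speech",)),
    # Low-vision users: voice output with selective focusing of content.
    "low_vision": InterfaceProfile(
        inputs=("pointer", "voice_command"),
        outputs=("selective_focus_speech", "magnified_view"),
    ),
    # Upper-limb disability: gaze tracking plus voice commands for the GUI.
    "upper_limb": InterfaceProfile(
        inputs=("gaze_tracking", "voice_command"), outputs=("visual",)
    ),
}

def route(disability_type: str) -> InterfaceProfile:
    return PROFILES[disability_type]

print(route("upper_limb").inputs)  # ('gaze_tracking', 'voice_command')
```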

Multimodal audiovisual speech recognition architecture using a three-feature multi-fusion method for noise-robust systems

  • Sanghun Jeon; Jieun Lee; Dohyeon Yeo; Yong-Ju Lee; SeungJun Kim
    • ETRI Journal / v.46 no.1 / pp.22-34 / 2024
  • Exposure to varied noisy environments impairs the recognition performance of artificial-intelligence-based speech recognition technologies. Services with degraded performance can be deployed as limited systems that assure good performance only in certain environments, but this impairs the general quality of speech recognition services. This study introduces an audiovisual speech recognition (AVSR) model that is robust across various noise settings by mimicking the elements of human dialogue recognition. For audio recognition, the model converts word embeddings and log-Mel spectrograms into feature vectors. A dense spatial-temporal convolutional neural network extracts features from log-Mel spectrograms transformed for visual-based recognition. This approach exhibits improved aural and visual recognition capabilities. We assess performance across nine synthesized noise environments at varying signal-to-noise ratios, with the proposed model exhibiting lower average error rates. The error rate of the AVSR model using the three-feature multi-fusion method is 1.711%, compared with 3.939% for the general model. The model's enhanced stability and recognition rate make it applicable in noise-affected environments.
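
As a point of reference for the audio front end, the sketch below shows one common way to compute a log-Mel spectrogram with librosa. The frame and mel parameters here are assumptions, not the paper's configuration.

```python
# Minimal sketch of a log-Mel front end of the kind an AVSR audio stream
# typically uses; frame/mel parameters are assumptions, not the paper's.
import numpy as np
import librosa

def log_mel(wav_path: str, sr: int = 16_000, n_mels: int = 80) -> np.ndarray:
    """Return an (n_mels, frames) log-Mel spectrogram in dB."""
    y, _ = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=400, hop_length=160, n_mels=n_mels  # 25 ms / 10 ms
    )
    return librosa.power_to_db(mel, ref=np.max)

# feats = log_mel("utterance.wav")  # hypothetical input file
```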

A Deep Learning Based Approach to Recognizing Accompanying Status of Smartphone Users Using Multimodal Data

  • Kim, Kilho; Choi, Sangwoo; Chae, Moon-jung; Park, Heewoong; Lee, Jaehong; Park, Jonghun
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.163-177 / 2019
  • As smartphones have become widely used, human activity recognition (HAR) tasks that recognize the personal activities of smartphone users from multimodal data have been actively studied. The research area is expanding from recognition of an individual user's simple body movements to recognition of low-level and high-level behavior. However, HAR tasks for recognizing interaction behavior with other people, such as whether the user is accompanying or communicating with someone else, have received less attention so far. Previous research on recognizing interaction behavior has usually depended on audio, Bluetooth, and Wi-Fi sensors, which are vulnerable to privacy issues and require much time to collect enough data. In contrast, physical sensors such as accelerometer, magnetic field, and gyroscope sensors are less vulnerable to privacy issues and can collect a large amount of data within a short time. In this paper, a deep-learning-based method for detecting accompanying status using only multimodal physical sensor data (accelerometer, magnetic field, and gyroscope) is proposed. Accompanying status is defined as a part of user interaction behavior, covering whether the user is accompanying an acquaintance at close distance and whether the user is actively communicating with that acquaintance. A framework based on convolutional neural networks (CNN) and long short-term memory (LSTM) recurrent networks is proposed for classifying accompanying and conversation. First, a data preprocessing method is introduced that consists of time synchronization of the multimodal data from the different physical sensors, data normalization, and sequence data generation. Nearest-neighbor interpolation is applied to synchronize the timestamps of data collected from the different sensors, normalization is performed on each x, y, and z axis value of the sensor data, and sequence data are generated with a sliding window. The sequence data then become the input to the CNN, which extracts feature maps representing local dependencies in the original sequence. The CNN consists of three convolutional layers and has no pooling layer, in order to maintain the temporal information of the sequence data. Next, LSTM recurrent networks receive the feature maps, learn long-term dependencies from them, and extract features. The LSTM recurrent networks consist of two layers, each with 128 cells. Finally, the extracted features are classified by a softmax classifier. The loss function is cross entropy, and the weights are randomly initialized from a normal distribution with mean 0 and standard deviation 0.1. The model is trained with the adaptive moment estimation (ADAM) optimizer with a mini-batch size of 128, and dropout is applied to the inputs of the LSTM recurrent networks to prevent overfitting. The initial learning rate is 0.001 and decays exponentially by a factor of 0.99 at the end of each training epoch. An Android smartphone application was developed and released to collect data from a total of 18 subjects. Using this data, the model classified accompanying and conversation with 98.74% and 98.83% accuracy, respectively. Both the F1 score and the accuracy of the model were higher than those of a majority-vote classifier, a support vector machine, and a deep recurrent neural network.
Future research will focus on more rigorous multimodal sensor data synchronization methods that minimize timestamp differences, and will further study transfer learning methods that allow models trained on the training data to transfer to evaluation data that follows a different distribution. A model that exhibits robust recognition performance against data changes not considered during training is expected to result.
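
Since the abstract specifies the architecture in unusual detail, a sketch is possible. The PyTorch code below follows the stated design (three convolutional layers without pooling, two 128-cell LSTM layers, dropout on the LSTM inputs, a softmax/cross-entropy head, ADAM with mini-batch size 128, and a 0.99 exponential learning-rate decay per epoch); the channel count, kernel sizes, and window length are assumptions, not the paper's values.

```python
# Hedged sketch of the CNN + LSTM classifier described above, in PyTorch.
# Channel counts, kernel sizes, and window length are assumptions; the
# layer counts, 128-cell LSTMs, dropout on the LSTM input, ADAM, and the
# 0.99-per-epoch learning-rate decay follow the abstract.
import torch
import torch.nn as nn

class AccompanyNet(nn.Module):
    def __init__(self, in_channels=9, n_classes=2):
        super().__init__()
        # Three convolutional layers, no pooling, to keep temporal resolution.
        self.cnn = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(64, 64, kernel_size=5, padding=2), nn.ReLU(),
        )
        self.drop = nn.Dropout(0.5)            # dropout on LSTM inputs
        self.lstm = nn.LSTM(64, 128, num_layers=2, batch_first=True)
        self.head = nn.Linear(128, n_classes)  # softmax via CrossEntropyLoss

    def forward(self, x):                      # x: (batch, window, channels)
        z = self.cnn(x.transpose(1, 2))        # -> (batch, 64, window)
        z = self.drop(z.transpose(1, 2))       # -> (batch, window, 64)
        out, _ = self.lstm(z)
        return self.head(out[:, -1])           # last time step -> class logits

model = AccompanyNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.99)  # step per epoch
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(128, 128, 9)                  # one mini-batch of 128 windows
loss = loss_fn(model(x), torch.randint(0, 2, (128,)))
loss.backward(); opt.step(); sched.step()
```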

User's Emotional Touch Recognition Interface Using Non-contact Touch Sensor and Accelerometer

  • Koo, Seong-Yong; Lim, Jong-Gwan; Kwon, Dong-Soo
    • Proceedings of the Korean HCI Society Conference / 2008.02a / pp.348-353 / 2008
  • This paper proposes a novel touch interface for recognizing a user's touch patterns and understanding emotional information by eliciting natural user interaction. To classify physical touches, we represent the similarity between touches by analyzing each touch according to its dictionary meaning, and we design an algorithm that recognizes various touch patterns in real time. Finally, we suggest a methodology for estimating the user's emotional state based on touch.

Seamless 2D/3D Interaction System Using a Tangible Object

  • Na, Se-Won; Ha, Tae-Jin; Woo, Woon-Tack
    • Proceedings of the Korean HCI Society Conference / 2007.02b / pp.264-269 / 2007
  • This paper proposes a tabletop 2D/3D interaction system using a tangible object. The proposed system was built by adding a cuboid tangible object and a movable camera-equipped monitor to the existing ARTable [1]. Markers used by ARToolkit [3] are attached to every face of the tangible object, and a vibrator and a Bluetooth communication module are embedded inside it. The camera-equipped monitor is mounted on a monitor arm so that the user can move it around to observe the ARTable tabletop. Using this system, users can not only receive guidance for finding the correct path while navigating a virtual space (2D interaction) on the display-type ARTable, but can also interact with virtual objects in 3D in an augmented reality environment. Because the vibration module and the Bluetooth module that controls it are built in, the system can give the user tactile feedback via the vibrator when specific events occur. The proposed system can be used in various fields such as education and entertainment.

A Review of Haptic Perception: Focused on Sensation and Application

  • Song, Joobong; Lim, Ji Hyoun; Yun, Myung Hwan
    • Journal of the Ergonomics Society of Korea / v.31 no.6 / pp.715-723 / 2012
  • Objective: The aim of this study is to investigate haptic perception research from three perspectives: cutaneous and proprioceptive sensations, active and passive touch, and cognition and emotion, and then to identify issues for implementing haptic interactions. Background: Although haptic technologies have improved and become practical, more research on methods of application is still needed to actualize multimodal interaction technology. A systematic approach to exploring haptic perception is required to understand emotional experience and social messages, as well as tactile feedback. Method: Content analysis was conducted to analyze trends in haptic-related research. Changes in issues and topics were investigated in terms of sensory dimensions and the different contents delivered via tactile perception. Result: The identified research opportunities were haptic perception in various body segments and emotion-related proprioceptive sensation. Conclusion: Understanding the mechanism by which users perceive haptic stimuli will help in developing effective haptic interaction, and this study provides insight into what to focus on for the future of haptic interaction. Application: This research is expected to inform the application of haptic perception for presence and emotional response in fields such as human-robot, human-device, and telecommunication interaction.

Automatic Adaptation Based Metaverse Virtual Human Interaction

  • Chung, Jin-Ho; Jo, Dongsik
    • KIPS Transactions on Software and Data Engineering / v.11 no.2 / pp.101-106 / 2022
  • Recently, virtual humans have been widely used in various fields such as education, training, and information guidance, and they are expected to be applied to services that interact with remote users in the metaverse. In this paper, we propose a novel method for making a virtual human's interaction reflect the user's surroundings. We use an editing authoring tool to apply the user's interactions in providing the virtual human's responses. The virtual human can recognize the user's situation based on fuzzy inference and present an optimal response to the user. With the context-aware interaction method presented in this paper, the virtual human can provide interaction suited to the surrounding environment through automatic adaptation.
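
The abstract mentions fuzzy-based situation recognition without further detail. The sketch below shows generic fuzzy rule evaluation of that general kind; the membership functions, inputs, and rules are invented for illustration and are not the authors' system.

```python
# Generic sketch of fuzzy situation scoring of the kind the abstract
# alludes to; membership functions and rules are invented for illustration.
def tri(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function peaking at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def situation_scores(distance_m: float, noise_db: float) -> dict[str, float]:
    near = tri(distance_m, 0.0, 0.5, 1.5)    # "user is near" membership
    quiet = tri(noise_db, 0.0, 30.0, 60.0)   # "environment is quiet" membership
    # Rule strengths via min (fuzzy AND); a virtual human would pick the
    # response associated with the strongest situation.
    return {
        "greet_verbally": min(near, quiet),
        "gesture_only": min(near, 1.0 - quiet),
    }

print(situation_scores(distance_m=0.7, noise_db=45.0))
```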

Effects of a Short-term Multimodal Group Intervention Program on Cognitive Function and Depression of the Elderly

  • Jung, Beom-Jin; Choi, Yu-Jin
    • Therapeutic Science for Rehabilitation / v.8 no.3 / pp.57-68 / 2019
  • Purpose: This study aimed to investigate the effects of a short-term group multimodal intervention program that combines physical activity, cognitive activity, and social interaction on the cognitive function and depression level of healthy individuals over 75 years old. Method: This study used a one-group pre-test/post-test design, with intervention delivered for 70 minutes per session, once a week, for four sessions in total. To compare changes in cognitive function, depression level, and physical function before and after the intervention, this study used the Mini-Mental State Examination-Dementia Screening (MMSE-DS), the Geriatric Depression Scale-Short Form (GDS-SF), and the Berg Balance Scale (BBS). Result: After the group multimodal intervention, there was a statistically significant improvement in cognitive function (p < 0.01) and a statistically significant decrease in depression level (p < 0.05). The balance score increased from 46.83 ± 9.11 points before the intervention to 48.08 ± 7.00 points after the intervention; however, this change was not statistically significant (p > 0.05). Conclusion: A short-term group multimodal intervention combining physical activity, cognitive activity, and social interaction had a significant effect on slowing the deterioration of cognitive function in healthy individuals over 75 years old and decreased their depression level. This study is significant in that it provides a foundation for more systematic interventions for the prevention of dementia and depression in healthy older individuals. Follow-up studies should verify these results through research on the effects of occupational therapists' professional treatment and through experimental/control group studies.