• Title/Summary/Keyword: Audio-visual interaction

Search results: 29

Human-Robot Interaction in Real Environments by Audio-Visual Integration

  • Kim, Hyun-Don;Choi, Jong-Suk;Kim, Mun-Sang
    • International Journal of Control, Automation, and Systems / Vol.5 No.1 / pp.61-69 / 2007
  • In this paper, we developed both a reliable sound localization system, including a VAD (Voice Activity Detection) component, using three microphones, and a face tracking system using a vision camera. Moreover, we proposed a way to integrate the three systems in human-robot interaction to compensate for errors in the localization of a speaker and to effectively reject unnecessary speech or noise signals arriving from undesired directions. To verify the system's performance, we installed the proposed audio-visual system on a prototype robot, called IROBAA (Intelligent ROBot for Active Audition), and demonstrated how to integrate the audio-visual system.
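The abstract does not spell out the localization algorithm, but a three-microphone sound localizer is commonly built on time-difference-of-arrival (TDOA) estimates between microphone pairs. A minimal sketch of that generic approach (the synthetic signal, sample rate, and 0.2 m microphone spacing are illustrative assumptions, not the paper's setup):

```python
import numpy as np

def estimate_tdoa(sig_ref, sig_delayed, fs):
    """Time delay (s) of sig_delayed relative to sig_ref via cross-correlation."""
    corr = np.correlate(sig_delayed, sig_ref, mode="full")
    lag = int(np.argmax(corr)) - (len(sig_ref) - 1)
    return lag / fs

def azimuth_from_tdoa(tdoa, mic_distance, c=343.0):
    """Convert a pairwise delay into a bearing angle (radians) for a mic pair."""
    s = np.clip(c * tdoa / mic_distance, -1.0, 1.0)  # keep arcsin in range
    return float(np.arcsin(s))

# Synthetic check: a noise burst reaching the second mic 5 samples later
fs = 16000
src = np.random.default_rng(0).standard_normal(1024)
delay = 5
mic1 = src
mic2 = np.concatenate([np.zeros(delay), src[:-delay]])
tdoa = estimate_tdoa(mic1, mic2, fs)               # recovers 5 / 16000 s
angle = azimuth_from_tdoa(tdoa, mic_distance=0.2)  # bearing for that delay
```

With three microphones, two independent pairwise delays of this kind are enough to resolve the speaker's direction in the horizontal plane.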

시청각 상호작용과 멀티미디어 시대의 디자인교육 (Audio-visual Interaction and Design-education in the Age of Multimedia)

  • 서계숙
    • 디자인학연구 / Vol.14 No.3 / pp.49-58 / 2001
  • In the multimedia age, communication designers must recognize not only visual elements such as color, form, time, and movement, but also sound as an expressive element for conveying messages. As is well known, a message is perceived better when sight and hearing are combined than when it is conveyed through either sense alone. The meeting of sight and hearing is grounded in synesthesia, which manifests as associations between color and tone, and between form and sound. As a basic example, low tones evoke dark colors and high tones evoke bright colors; percussion instruments suggest circles, and melodies suggest lines. In multimedia, unlike in earlier audio-visual media, visual and auditory elements must act as independent expressive elements, moving beyond mere synchrony in which the sound heard simply matches the scene shown. When vision and hearing meet as independent expressive elements and interact, they can evoke a new impression that neither could achieve alone. Design education in the multimedia age therefore requires programs that develop the ability to understand this principle of audio-visual interaction and to express messages audio-visually. This paper divides such a program into audio-visual form, audio-visual composition, and audio-visual design, and presents concrete exercises as examples.
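The synesthetic mapping the abstract mentions (low tones evoke dark colors, high tones bright ones) can be illustrated with a toy pitch-to-lightness function. The log-frequency scale and the 27.5-4186 Hz (piano, A0 to C8) range are assumptions for illustration, not part of the paper:

```python
import math

def pitch_to_lightness(freq_hz, lo=27.5, hi=4186.0):
    """Map a pitch (Hz) to a lightness value in [0, 1]:
    low tones -> dark (0), high tones -> bright (1), on a log-frequency scale."""
    f = min(max(freq_hz, lo), hi)  # clamp to the assumed audible range
    return math.log(f / lo) / math.log(hi / lo)
```

A design exercise could drive the brightness of an animated shape from such a mapping, making the audio-visual association explicit to students.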


A Novel Integration Scheme for Audio Visual Speech Recognition

  • Pham, Than Trung;Kim, Jin-Young;Na, Seung-You
    • 한국음향학회지 / Vol.28 No.8 / pp.832-842 / 2009
  • Automatic speech recognition (ASR) has been successfully applied to many real human-computer interaction (HCI) applications; however, its performance degrades significantly in noisy environments. Audio visual speech recognition (AVSR), which uses the acoustic signal together with lip motion, has recently attracted attention due to its noise robustness. In this paper, we describe a novel integration scheme for AVSR based on a late integration approach. First, we introduce a robust reliability measurement for the audio and visual modalities using model-based and signal-based information: the model-based sources measure the confusability of the vocabulary, while the signal-based information is used to estimate the noise level. Second, the output probabilities of the audio and visual speech recognizers are each normalized before the final integration step, which uses the normalized output space and estimated weights. We evaluate the proposed method on a Korean isolated-word recognition system. The experimental results demonstrate the effectiveness and feasibility of the proposed system compared to conventional systems.
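The late-integration step the abstract describes (normalize each recognizer's outputs, then combine them with estimated weights) can be sketched as a weighted log-linear combination of the two normalized distributions. The toy scores and the fixed weights below are illustrative; the paper estimates its weights from the reliability measures:

```python
import numpy as np

def normalize_scores(log_scores):
    """Softmax: turn per-word log-scores into a probability distribution."""
    z = np.asarray(log_scores, float)
    z = z - z.max()  # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def late_fusion(audio_scores, visual_scores, audio_weight):
    """Weighted log-linear combination of the two normalized recognizer
    outputs; returns the index of the winning vocabulary word."""
    pa = normalize_scores(audio_scores)
    pv = normalize_scores(visual_scores)
    combined = audio_weight * np.log(pa) + (1.0 - audio_weight) * np.log(pv)
    return int(np.argmax(combined))

# Toy 3-word vocabulary: audio favours word 0, visual clearly favours word 2.
audio = [2.0, 0.0, 0.5]
visual = [-2.0, -1.5, 1.0]
print(late_fusion(audio, visual, audio_weight=0.3))  # -> 2 (visual dominates)
print(late_fusion(audio, visual, audio_weight=0.9))  # -> 0 (audio dominates)
```

Lowering the audio weight when the estimated noise level is high is exactly what lets such a scheme stay robust in noisy conditions.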

시청각 자극의 시간적 인지 판단 (Temporal-perceptual Judgement of Visuo-Auditory Stimulation)

  • 유미;이상민;박용군;권대규;김남균
    • 한국정밀공학회지 / Vol.24 No.1 / pp.101-109 / 2007
  • In spatio-temporal perception of visuo-auditory stimuli, the optimal integration hypothesis proposes that the perceptual process is optimized through the interaction of the senses to maximize the precision of perception. Thus, when visual information, generally considered dominant over the other senses, is ambiguous, information from another sense such as an auditory stimulus influences the perceptual process through interaction with the visual information. We therefore performed two experiments to ascertain the conditions under which the senses interact and the influence of those conditions, considering the interaction of visuo-auditory stimulation in free space, the color of the visual stimulus, and the sex of the participants, all with normal subjects. In the first experiment, 12 participants were asked to judge the change in the frequency of audio-visual stimulation using a visual flicker and an auditory flutter stimulation in free space. When auditory temporal cues were presented, the perceived change in the frequency of the visual stimulation was associated with a change in the frequency of the auditory stimulation, consistent with the results of previous studies using headphones. In the second experiment, 30 male and 30 female participants were asked to judge the change in the frequency of audio-visual stimulation using a colored (red or green) visual flicker and an auditory flutter stimulation. In the color condition, male and female participants showed the same perceptual tendency; however, the standard deviation for female participants was larger than that for males. These results imply that audio-visual asymmetry effects are influenced by the cues of the visual and auditory information, such as the orientation between the auditory and visual stimuli and the color of the visual stimulus.
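The optimal integration hypothesis the abstract invokes is usually formalized as precision-weighted (maximum-likelihood) cue combination: each cue is weighted by its reliability, the inverse of its variance. A minimal sketch with illustrative numbers:

```python
def ml_cue_integration(x_vis, var_vis, x_aud, var_aud):
    """Precision-weighted average of a visual and an auditory estimate.
    The reliability (inverse variance) of each cue sets its weight, and the
    combined variance is lower than either cue's alone."""
    w_vis = (1.0 / var_vis) / (1.0 / var_vis + 1.0 / var_aud)
    x_hat = w_vis * x_vis + (1.0 - w_vis) * x_aud
    var_hat = 1.0 / (1.0 / var_vis + 1.0 / var_aud)
    return x_hat, var_hat

# Reliable visual cue (var 1.0) at 10 Hz, noisy auditory cue (var 4.0) at 14 Hz:
# the combined estimate lands much nearer the reliable visual cue.
x_hat, var_hat = ml_cue_integration(10.0, 1.0, 14.0, 4.0)
```

When the visual cue becomes ambiguous (its variance grows), the same formula shifts the weight toward the auditory cue, which is the pattern the experiments above probe.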

Robust Person Identification Using Optimal Reliability in Audio-Visual Information Fusion

  • Tariquzzaman, Md.;Kim, Jin-Young;Na, Seung-You;Choi, Seung-Ho
    • The Journal of the Acoustical Society of Korea / Vol.28 No.3E / pp.109-117 / 2009
  • Identity recognition in real environments with a reliable mode is a key issue in human-computer interaction (HCI). In this paper, we present a robust person identification system based on a score-based optimal reliability measure of audio-visual modalities. We propose an extension of the modified reliability function by introducing optimizing parameters for both the audio and visual modalities. To degrade the visual signals, we applied JPEG compression to the test images. In addition, to create a mismatch between the enrollment and test sessions, acoustic Babble noise and artificial illumination were added to the test audio and visual signals, respectively. Local PCA was used on both modalities to reduce the dimension of the feature vectors. We applied a swarm intelligence algorithm, particle swarm optimization (PSO), to optimize the modified reliability function's optimizing parameters. The overall person identification experiments were performed on the VidTimit DB. Experimental results show that the proposed optimal reliability measures enhanced the identification accuracy by 7.73% and 8.18% under varying illumination directions on the visual signal and corresponding Babble noise on the audio signal, respectively, compared with the best single-classifier system in the fusion, while maintaining the modality reliability statistics in terms of performance, thus verifying the consistency of the proposed extension.
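Particle swarm optimization, which the paper uses to tune the reliability function's parameters, can be sketched in a few lines. The inertia and acceleration constants below are common textbook defaults, and the quadratic objective merely stands in for the paper's identification-accuracy objective:

```python
import numpy as np

def pso_minimize(f, lo, hi, n_particles=20, iters=100, seed=0):
    """Minimal particle swarm optimizer for a 1-D objective on [lo, hi]."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(lo, hi, n_particles)          # particle positions
    v = np.zeros(n_particles)                     # particle velocities
    pbest = x.copy()                              # personal best positions
    pbest_f = np.array([f(xi) for xi in x])
    gbest = pbest[np.argmin(pbest_f)]             # global best position
    w, c1, c2 = 0.7, 1.5, 1.5                     # inertia / cognitive / social
    for _ in range(iters):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, lo, hi)                # stay inside the search box
        fx = np.array([f(xi) for xi in x])
        better = fx < pbest_f
        pbest[better], pbest_f[better] = x[better], fx[better]
        gbest = pbest[np.argmin(pbest_f)]
    return float(gbest)

# Stand-in objective: the swarm should find the minimum at 0.7
best = pso_minimize(lambda x: (x - 0.7) ** 2, 0.0, 1.0)
```

In the paper's setting, the objective would instead evaluate identification accuracy on a held-out set for each candidate parameter value.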

청각 및 시각 정보를 이용한 강인한 음성 인식 시스템의 구현 (Constructing a Noise-Robust Speech Recognition System using Acoustic and Visual Information)

  • 이종석;박철훈
    • 제어로봇시스템학회논문지 / Vol.13 No.8 / pp.719-725 / 2007
  • In this paper, we present an audio-visual speech recognition system for noise-robust human-computer interaction. Unlike usual speech recognition systems, our system utilizes the visual signal containing the speaker's lip movements along with the acoustic signal to obtain robust recognition performance against environmental noise. The procedures of acoustic speech processing, visual speech processing, and audio-visual integration are described in detail. Experimental results demonstrate that, by exploiting the complementary nature of the two signals, the constructed system significantly enhances recognition performance in noisy circumstances compared to acoustic-only recognition.

오디오 기반 SNS의 인터페이스 디자인 요소 연구 (A Study on the Elements of Interface Design of Audio-based Social Networking Service)

  • 김연수;최종훈
    • 한국융합학회논문지 / Vol.13 No.2 / pp.143-150 / 2022
  • Audio-based social networking services also need visual guides that lead users to the content they want. This study therefore examined the visual interface design elements that affect the user experience of audio content in audio-based SNS. A review of prior research showed that conventional interface design elements are important to the usability of audio content. An analysis of currently released audio-based SNS confirmed the meaning and influence of the existing interface elements, and an analysis of other audio content services yielded a new set of interface evaluation attributes to consider for audio SNS. Accordingly, this study newly defines a multimedia element in addition to the five conventional interface evaluation elements (layout, color, icon, typography, and graphic image) and proposes it as an element to consider in the UI of audio-based SNS.

차량 시스템 개발 및 운전자 인자 연구를 위한 실시간 차량 시뮬레이터의 개발 (Development of a Real-Time Driving Simulator for Vehicle System Development and Human Factor Study)

  • 이승준
    • 한국자동차공학회논문집 / Vol.7 No.7 / pp.250-257 / 1999
  • Driving simulators are used effectively for human factor study, vehicle system development, and other purposes by enabling actual driving conditions to be reproduced in a safe and tightly controlled environment. Interactive simulation requires appropriate sensory and stimulus cuing to the driver. Sensory and stimulus feedback can include visual, auditory, motion, and proprioceptive cues. A fixed-base driving simulator has been developed in this study for vehicle system development and human factor study. The simulator consists of improved and synergistic subsystems (a real-time vehicle simulation system, a visual/audio system, and a control force loading system) based on the motion-base simulator KMU DS-Ⅰ, developed for the design and evaluation of a full-scale driving simulator and for driver-vehicle interaction.


차량 주행 감각 재현을 위한 운전 시뮬레이터 개발에 관한 연구 (I) (A study on the Development of a Driving Simulator for Reappearance of Vehicle Motion (I))

  • 박민규;이민철;손권;유완석;한명철;이장명
    • 한국정밀공학회지 / Vol.16 No.6 / pp.90-99 / 1999
  • A vehicle driving simulator is a virtual reality device that makes a human feel as if he or she were actually driving a vehicle. The driving simulator is used effectively for studying driver-vehicle interaction and for developing new-concept vehicle systems. The driving simulator consists of a vehicle motion bed system, a motion controller, a visual and audio system, a vehicle dynamic analysis system, a cockpit system, etc. In this paper, the main procedures to develop the driving simulator are classified into five parts. First, a motion bed system and a motion controller, which can track a reference trajectory, are developed. Secondly, a performance evaluation of the motion bed system for the driving simulator is carried out using LVDTs and accelerometers. Thirdly, a washout algorithm to realize the motion of an actual vehicle in the driving simulator is developed; the algorithm maps the motion space of a vehicle into the workspace of the driving simulator. Fourthly, a visual and audio system for enhanced realism is developed. Finally, an integration system to communicate between and monitor the subsystems is developed.
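The washout algorithm mentioned in the third step is classically built from high-pass filters: transient accelerations pass through to the motion base, while sustained ones are washed out so the base drifts back to neutral inside its limited workspace. A minimal first-order sketch (the time constant and time step below are illustrative assumptions, not the paper's values):

```python
def highpass_washout(accel, dt, tau):
    """First-order high-pass ('washout') filter on an acceleration series:
    onset transients are passed to the motion base; sustained accelerations
    decay toward zero so the base recenters within its workspace."""
    alpha = tau / (tau + dt)
    out = [0.0]
    for k in range(1, len(accel)):
        out.append(alpha * (out[-1] + accel[k] - accel[k - 1]))
    return out

# A sustained 1 m/s^2 step: felt strongly at onset, then washed out
accel = [0.0] + [1.0] * 2000
cue = highpass_washout(accel, dt=0.01, tau=2.0)
```

Full classical washout schemes add tilt coordination, which substitutes a slow cabin tilt for the washed-out sustained acceleration, but the high-pass stage above is the core of the workspace mapping.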
