Visual and Audio System


A 3D Audio-Visual Animated Agent for Expressive Conversational Question Answering

  • Martin, J.C.;Jacquemin, C.;Pointal, L.;Katz, B.
    • Korea Information Convergence Society: Conference Proceedings
    • /
    • 2008.06a
    • /
    • pp.53-56
    • /
    • 2008
  • This paper reports on the ACQA (Animated agent for Conversational Question Answering) project conducted at LIMSI. The aim is to design an expressive animated conversational agent (ACA) for conducting research along two main lines: (1) perceptual experiments (e.g., perception of expressivity and 3D movements in both the audio and visual channels); (2) design of human-computer interfaces requiring head models at different resolutions and the integration of the talking head into virtual scenes. The target application of this expressive ACA is RITEL, a real-time, speech-based question-and-answer system developed at LIMSI. The architecture of the system is based on distributed modules exchanging messages through a network protocol. The main components of the system are: RITEL, a question-and-answer system searching raw text, which produces a text (the answer) and attitudinal information; the attitudinal information is then processed to deliver expressive tags; the text is converted into phoneme, viseme, and prosodic descriptions. Audio speech is generated by the LIMSI selection-concatenation text-to-speech engine. Visual speech uses MPEG-4 keypoint-based animation and is rendered in real time by Virtual Choreographer (VirChor), a GPU-based 3D engine. Finally, visual and audio speech is played in a 3D audio-visual scene. The project also puts considerable effort into realistic 3D visual and audio rendering: a new model of phoneme-dependent human radiation patterns is included in the speech synthesis system, so that the ACA can move through the virtual scene with realistic 3D visual and audio rendering.
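
The distributed, message-passing architecture described in this abstract (question answering → expressive tags → phoneme/viseme/prosody → audio and visual rendering) can be illustrated with a minimal sketch. The topic names, message fields, and module behaviors below are invented for illustration; they are not the actual RITEL or VirChor protocol.

```python
import json
from typing import Callable, Dict, List

# Minimal message bus: modules subscribe to topics and exchange JSON
# messages, mimicking the distributed architecture described above.
class MessageBus:
    def __init__(self) -> None:
        self.subscribers: Dict[str, List[Callable[[dict], None]]] = {}

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self.subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic: str, message: dict) -> None:
        payload = json.loads(json.dumps(message))  # simulate network serialization
        for handler in self.subscribers.get(topic, []):
            handler(payload)

bus = MessageBus()

# Hypothetical QA module: emits the answer text plus attitudinal information.
def qa_module(question: str) -> None:
    bus.publish("answer", {"text": "Paris.", "attitude": "confident"})

# Expressive-tag module: converts attitudinal information into expressive tags.
def tagger(msg: dict) -> None:
    tag = "emphatic" if msg["attitude"] == "confident" else "neutral"
    bus.publish("tagged", {"text": msg["text"], "tag": tag})

# Rendering module: in the real system this step would drive the TTS engine
# (phonemes, prosody) and the MPEG-4 keypoint animation (visemes).
def renderer(msg: dict) -> None:
    print(f"render '{msg['text']}' with expressive tag '{msg['tag']}'")

bus.subscribe("answer", tagger)
bus.subscribe("tagged", renderer)
qa_module("What is the capital of France?")
```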


Analysis of learning effects using audio-visual manual of SWAT (SWAT의 시청각 매뉴얼을 통한 학습 효과 분석)

  • Lee, Ju-Yeong;Kim, Tea-Ho;Ryu, Ji-Chul;Kang, Hyun-Woo;Kum, Dong-Hyuk;Woo, Won-Hee;Jang, Chun-Hwa;Choi, Jong-Dae;Lim, Kyoung-Jae
    • Korean Journal of Agricultural Science
    • /
    • v.38 no.4
    • /
    • pp.731-737
    • /
    • 2011
  • In modern society, GIS-based decision support systems have been used to evaluate environmental issues and changes, owing to the spatial and temporal analysis capabilities of GIS. Without a proper manual for these systems, however, their intended goals cannot be achieved. In this study, an audio-visual SWAT tutorial system was developed, and its effectiveness in learning the SWAT model was evaluated. Learning effects were analyzed through an in-class demonstration followed by a survey. The survey was conducted with 3rd-grade students, with and without the audio-visual materials, using 30 questionnaires composed of 3 items on respondent background, 5 items on the effects of audio-visual materials, and 12 items on the effect of learning the model with or without the manual. The group without the audio-visual manual scored 2.98 out of 5, while the group with the audio-visual manual scored 4.05 out of 5, indicating better content delivery with the audio-visual learning materials. As shown in this study, audio-visual learning materials should be developed and used in various computer-based modeling systems.

A Novel Integration Scheme for Audio Visual Speech Recognition

  • Pham, Than Trung;Kim, Jin-Young;Na, Seung-You
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.8
    • /
    • pp.832-842
    • /
    • 2009
  • Automatic speech recognition (ASR) has been successfully applied to many real human-computer interaction (HCI) applications; however, its performance tends to decrease significantly in noisy environments. Audio-visual speech recognition (AVSR), which uses the acoustic signal together with lip motion, has recently attracted attention due to its noise robustness. In this paper, we describe our novel integration scheme for AVSR based on a late integration approach. First, we introduce a robust reliability measurement for the audio and visual modalities using model-based and signal-based information: the model-based source measures the confusability of the vocabulary, while the signal-based source estimates the noise level. Second, the output probabilities of the audio and visual speech recognizers are each normalized before the final integration step, which combines the normalized output spaces with estimated weights. We evaluate the performance of the proposed method on a Korean isolated-word recognition system. The experimental results demonstrate the effectiveness and feasibility of the proposed system compared to conventional systems.
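
The late-integration step described above can be sketched as follows. This is a generic log-linear fusion of normalized recognizer scores with a single audio weight; the paper's actual normalization and reliability-based weight estimation (vocabulary confusability plus noise level) are not reproduced here.

```python
import numpy as np

def normalize_log_scores(log_scores: np.ndarray) -> np.ndarray:
    # Softmax turns per-word log scores into a posterior-like distribution.
    shifted = log_scores - log_scores.max()
    probs = np.exp(shifted)
    return probs / probs.sum()

def late_fusion(audio_log: np.ndarray, visual_log: np.ndarray,
                audio_weight: float) -> int:
    # audio_weight in [0, 1]; lower it when the audio channel is unreliable.
    pa = normalize_log_scores(audio_log)
    pv = normalize_log_scores(visual_log)
    combined = audio_weight * np.log(pa) + (1.0 - audio_weight) * np.log(pv)
    return int(np.argmax(combined))  # index of the recognized word

# Example: three-word vocabulary; noisy audio weakly favours word 1,
# the lip stream strongly favours word 2, and the fusion follows the lips.
audio_scores = np.array([-10.0, -9.5, -9.8])
visual_scores = np.array([-12.0, -13.0, -8.0])
print(late_fusion(audio_scores, visual_scores, audio_weight=0.3))  # -> 2
```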

Configuration of Audio-Visual System using Visual Image (이미지를 활용한 오디오-비쥬얼 시스템 구성)

  • Seo, June-Seok;Hong, Sung-Dae;Park, Jin-Wan
    • The Journal of the Korea Contents Association
    • /
    • v.8 no.6
    • /
    • pp.121-129
    • /
    • 2008
  • The problem of giving concrete form to a shapeless medium is the starting point for presenting information by means of sound. An audio-visual system using sound as a medium is a method that presents auditory material visually and serves to link different sensory organs; in a sense, it transfers a non-concrete sensation into a concrete one. Recent audio-visual presentations using active and non-active images produced by computerized random procedures can be limited by the restricted forms of visual output. In contrast, visualization using active images can induce more diverse expressions with sound as the medium. This study suggests a new mode of expression in animation that visualizes various auditory materials and sounds through an audio-visual system with active images.
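
As a rough illustration of driving visuals from auditory material, the sketch below maps two standard audio features to visual parameters. The feature choice (RMS loudness to size, spectral centroid to hue) is a common convention assumed for illustration, not the authors' scheme.

```python
import numpy as np

# Sound-to-image mapping sketch: derive per-frame visual parameters from a
# short audio frame, in the spirit of the audio-visual presentation above.
def frame_to_visual(frame: np.ndarray, sample_rate: int) -> dict:
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    loudness = float(np.sqrt(np.mean(frame ** 2)))               # RMS level
    centroid = float((freqs * spectrum).sum() / (spectrum.sum() + 1e-12))
    return {
        "size": min(1.0, loudness * 10.0),              # louder -> larger shape
        "hue": min(1.0, centroid / (sample_rate / 2)),  # brighter -> warmer hue
    }

# Example: a 440 Hz tone frame drives one animation frame's parameters.
sr = 16000
t = np.arange(1024) / sr
tone = 0.2 * np.sin(2 * np.pi * 440 * t)
print(frame_to_visual(tone, sr))
```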

Human-Robot Interaction in Real Environments by Audio-Visual Integration

  • Kim, Hyun-Don;Choi, Jong-Suk;Kim, Mun-Sang
    • International Journal of Control, Automation, and Systems
    • /
    • v.5 no.1
    • /
    • pp.61-69
    • /
    • 2007
  • In this paper, we developed a reliable sound localization system, including a VAD (Voice Activity Detection) component, using three microphones, as well as a face tracking system using a vision camera. Moreover, we proposed a way to integrate these systems for human-robot interaction, in order to compensate for errors in localizing the speaker and to effectively reject unnecessary speech or noise signals arriving from undesired directions. To verify the system's performance, we installed the proposed audio-visual system on a prototype robot, called IROBAA (Intelligent ROBot for Active Audition), and demonstrated how to integrate the audio-visual system.
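
The integration idea, rejecting sounds that do not come from a direction where a face is tracked, can be sketched as a simple audio-visual gate. The tolerance value and angle convention below are assumptions; the paper's actual compensation logic is more elaborate.

```python
# Audio-visual gating sketch: accept a speech event only when the estimated
# sound direction agrees with a tracked face direction, rejecting noise
# arriving from undesired directions.
def angular_difference(a_deg: float, b_deg: float) -> float:
    diff = abs(a_deg - b_deg) % 360.0
    return min(diff, 360.0 - diff)

def accept_speech(sound_azimuth: float, face_azimuths: list,
                  vad_is_speech: bool, tolerance_deg: float = 15.0) -> bool:
    if not vad_is_speech:
        return False  # VAD gate: ignore non-speech audio entirely
    # Accept only if some detected face lies near the estimated sound direction.
    return any(angular_difference(sound_azimuth, f) <= tolerance_deg
               for f in face_azimuths)

print(accept_speech(32.0, [30.0, 120.0], vad_is_speech=True))   # True
print(accept_speech(200.0, [30.0, 120.0], vad_is_speech=True))  # False
```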

Robust Person Identification Using Optimal Reliability in Audio-Visual Information Fusion

  • Tariquzzaman, Md.;Kim, Jin-Young;Na, Seung-You;Choi, Seung-Ho
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.3E
    • /
    • pp.109-117
    • /
    • 2009
  • Identity recognition with a reliable mode in real environments is a key issue in human-computer interaction (HCI). In this paper, we present a robust person identification system that considers a score-based optimal reliability measure for the audio and visual modalities. We propose an extension of the modified reliability function by introducing optimizing parameters for both the audio and visual modalities. To degrade the visual signals, we applied JPEG compression to the test images; in addition, to create a mismatch between the enrollment and test sessions, acoustic babble noise and artificial illumination were added to the test audio and visual signals, respectively. Local PCA was used on both modalities to reduce the dimension of the feature vectors. We applied a swarm intelligence algorithm, particle swarm optimization, to tune the modified reliability function's optimizing parameters. The overall person identification experiments were performed on the VidTimit DB. Experimental results show that the proposed optimal reliability measures enhanced identification accuracy by 7.73% and 8.18% under varied illumination direction on the visual signal and corresponding babble noise on the audio signal, respectively, in comparison with the best single classifier in the fusion system, while maintaining the modality reliability statistics, thus verifying the consistency of the proposed extension.
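
A minimal particle swarm optimization loop of the kind used to tune such fusion parameters is sketched below, with a toy objective standing in for identification accuracy; the swarm constants and the two-parameter search space are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy objective with a known optimum at (0.7, 0.3); in the paper's setting
# this would be identification accuracy as a function of the reliability
# function's optimizing parameters.
def objective(params: np.ndarray) -> float:
    return -((params[0] - 0.7) ** 2 + (params[1] - 0.3) ** 2)

def pso(n_particles: int = 20, n_iters: int = 50, dim: int = 2) -> np.ndarray:
    pos = rng.uniform(0.0, 1.0, (n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    gbest = pbest[pbest_val.argmax()].copy()
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration coefficients
    for _ in range(n_iters):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, 0.0, 1.0)
        vals = np.array([objective(p) for p in pos])
        improved = vals > pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        gbest = pbest[pbest_val.argmax()].copy()
    return gbest

print(pso())  # converges near [0.7, 0.3]
```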

A Study on the Extension of the Description Elements for Audio-visual Archives (시청각기록물의 기술요소 확장에 관한 연구)

  • Nam, Young-Joon;Moon, Jung-Hyun
    • Journal of the Korean BIBLIA Society for library and Information Science
    • /
    • v.21 no.4
    • /
    • pp.67-80
    • /
    • 2010
  • The output and usage rate of audio-visual materials have sharply increased as the information industry advances and diverse archives have become available. However, audio-visual archives are still widely regarded as separate records of merely collateral value. The organizations that hold these materials have very weak systems in areas such as categorization and archiving methods; moreover, the management systems vary among organizations, so users face difficulty retrieving and using audio-visual materials. Thus, this study examined the feasibility of integrated management of audio-visual archives by comparing the descriptive elements used for audio-visual archives in major domestic agencies. On that basis, it examines the metadata elements of these organizations and the feasibility of their integrated management, in order to propose improvements to the management, retrieval, and service of audio-visual materials, along with improvements to the descriptive elements of the metadata.

Constructing a Noise-Robust Speech Recognition System using Acoustic and Visual Information (청각 및 시각 정보를 이용한 강인한 음성 인식 시스템의 구현)

  • Lee, Jong-Seok;Park, Cheol-Hoon
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.13 no.8
    • /
    • pp.719-725
    • /
    • 2007
  • In this paper, we present an audio-visual speech recognition system for noise-robust human-computer interaction. Unlike usual speech recognition systems, our system utilizes the visual signal containing the speaker's lip movements along with the acoustic signal to obtain robust recognition performance against environmental noise. The procedures of acoustic speech processing, visual speech processing, and audio-visual integration are described in detail. Experimental results demonstrate that, by exploiting the complementary nature of the two signals, the constructed system significantly improves recognition performance in noisy conditions compared to acoustic-only recognition.
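
One common way such systems trade off the two streams is to derive the audio weight from an estimated signal-to-noise ratio before the integration step, as sketched below; the logistic mapping and its constants are illustrative assumptions, not the values used in the paper.

```python
import numpy as np

# SNR-driven stream weighting sketch: high SNR -> trust the acoustic stream,
# low SNR -> shift weight toward the visual (lip) stream.
def estimate_snr_db(signal_frame: np.ndarray, noise_frame: np.ndarray) -> float:
    sig_power = np.mean(signal_frame ** 2)
    noise_power = np.mean(noise_frame ** 2) + 1e-12
    return 10.0 * np.log10(sig_power / noise_power)

def audio_weight_from_snr(snr_db: float, midpoint: float = 10.0,
                          slope: float = 0.3) -> float:
    # Logistic mapping: ~0 weight in heavy noise, ~1 in clean conditions.
    return 1.0 / (1.0 + np.exp(-slope * (snr_db - midpoint)))

print(round(audio_weight_from_snr(25.0), 2))  # clean speech -> ~0.99
print(round(audio_weight_from_snr(0.0), 2))   # heavy noise  -> ~0.05
```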

A Case Study of the Audio-Visual Archives System Development and Management (시청각(사진/동영상) 기록물 관리를 위한 시스템 구축과 운영 사례 연구)

  • Shin, Dong-Hyeon;Jung, Se-Young;Kim, Seon-Heon
    • Journal of Korean Society of Archives and Records Management
    • /
    • v.9 no.1
    • /
    • pp.33-50
    • /
    • 2009
  • The ADD (Agency for Defense Development) has developed a digital audio-visual archives management system to ensure easy access to and long-term preservation of digital audio-visual archives. This paper covers the whole process of system development and database management, from the perspectives of preservation and of utilization through users' easy searching of digitized audio-visual archives. In detail, it covers the system design for handling image and video data, the establishment of a standard workflow, data quality, and the metadata settings used when converting analog materials into digital form for the database. The study also emphasizes the importance of an audio-visual archives management system through a cost-effectiveness analysis.
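
As a rough illustration of the kind of metadata settings such a digitization workflow requires, the sketch below defines a minimal record for one digitized item. The field set and sample values are hypothetical; the actual ADD metadata schema is not given in the abstract.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List

# Minimal metadata record for a digitized audio-visual item: descriptive
# fields plus the preservation choices made at digitization time.
@dataclass
class AVRecord:
    identifier: str
    title: str
    creator: str
    media_type: str                 # "image" or "video"
    capture_date: date
    source_format: str              # e.g. analog tape, film, print
    digital_format: str             # e.g. JPEG 2000, MPEG-4
    quality: str                    # resolution/bitrate chosen at digitization
    keywords: List[str] = field(default_factory=list)

record = AVRecord(
    identifier="AV-0001",           # hypothetical identifier scheme
    title="Sample digitized tape",
    creator="Records Office",
    media_type="video",
    capture_date=date(2008, 3, 14),
    source_format="Betacam SP",
    digital_format="MPEG-4",
    quality="720x486, 8 Mbps",
    keywords=["sample", "digitization"],
)
print(record.identifier, record.digital_format)
```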

A study on the Development of a Driving Simulator for Reappearance of Vehicle Motion (I) (차량 주행 감각 재현을 위한 운전 시뮬레이터 개발에 관한 연구 (I))

  • Park, Min-Kyu;Lee, Min-Cheol;Son, Kwon;Yoo, Wan-Suk;Han, Myung-Chul;Lee, Jang-Myung
    • Journal of the Korean Society for Precision Engineering
    • /
    • v.16 no.6
    • /
    • pp.90-99
    • /
    • 1999
  • A vehicle driving simulator is a virtual reality device that makes a human operator feel as if he or she were actually driving a vehicle. The driving simulator is used effectively for studying driver-vehicle interaction and for developing new-concept vehicle systems. The driving simulator consists of a vehicle motion bed system, a motion controller, a visual and audio system, a vehicle dynamics analysis system, a cockpit system, etc. In this paper, the main procedures for developing the driving simulator are divided into five parts. First, a motion bed system and a motion controller that can track a reference trajectory are developed. Second, the performance of the motion bed system is evaluated using LVDTs and accelerometers. Third, a washout algorithm that reproduces the motion of an actual vehicle within the driving simulator is developed; the algorithm maps the motion space of a vehicle into the workspace of the driving simulator. Fourth, a visual and audio system for a heightened sense of realism is developed. Finally, an integration system for communication and monitoring between the subsystems is developed.
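
The washout idea in the third step, rendering acceleration onsets within the platform's limited workspace and letting sustained cues decay, can be sketched with a simple high-pass filter; the filter order and cutoff below are illustrative assumptions, and tilt coordination for sustained acceleration is omitted.

```python
import numpy as np

# Classical-washout sketch: high-pass filter the vehicle's longitudinal
# acceleration so the motion platform renders onsets, then washes out to
# stay inside its workspace.
def high_pass_washout(accel: np.ndarray, dt: float,
                      cutoff_hz: float = 0.5) -> np.ndarray:
    rc = 1.0 / (2.0 * np.pi * cutoff_hz)
    alpha = rc / (rc + dt)
    out = np.zeros_like(accel)
    for i in range(1, len(accel)):
        # Discrete first-order high-pass: passes transients, decays steady input.
        out[i] = alpha * (out[i - 1] + accel[i] - accel[i - 1])
    return out

# Example: a step in acceleration is rendered briefly, then washed out.
dt = 0.01
accel = np.concatenate([np.zeros(100), np.ones(400)])  # 1 m/s^2 step input
platform_cmd = high_pass_washout(accel, dt)
print(platform_cmd[101], platform_cmd[400])  # ~1 at onset, ~0 after washout
```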
